CLOUD COMPUTING NOTES
CLOUD DATA MIGRATION:
Cloud migration is the process of moving digital business operations into the cloud. Cloud migration is sort of like a physical move, except it involves moving data, applications and IT processes from one data center to another instead of packing up and moving physical goods. Most often, cloud migration describes the move from on-premises or legacy (old but still in use) infrastructure to the cloud.
WHAT IS LEGACY INFRASTRUCTURE ?
In computing, hardware or software is considered legacy if it is outdated but still in use. Legacy products and processes are usually not as efficient or secure as more up-to-date solutions. Businesses stuck running legacy systems are in danger of falling behind their competitors; they also face an increased risk of data breaches.
WHAT CLOUD MIGRATION STRATEGY SHOULD AN ENTERPRISE ADOPT ?
i) Rehost
ii) Refactor
iii) Revise
iv) Rebuild
v) Replace
These five strategies are commonly called the 5 Rs.
i) Rehost: Rehosting can be thought of as doing the same thing, but on cloud servers. Companies that choose this strategy select an infrastructure-as-a-service (IaaS) provider and recreate their application architecture on that infrastructure.
ii) Refactor: Companies that choose to refactor reuse already existing code and frameworks, but run their applications on a platform-as-a-service (PaaS) provider's platform instead of on infrastructure-as-a-service, as in rehosting.
iii) Revise: This strategy involves partially rewriting (expanding) the code base, then deploying it by either rehosting or refactoring.
iv) Rebuild: To rebuild means rewriting and re-architecting the application from the ground up on a PaaS provider's platform.
v) Replace: Businesses can also opt to discard their old applications altogether and switch to already-built software-as-a-service (SaaS) applications from third-party vendors.
**************************************
In a distributed architecture, components are hosted on different platforms and several components can cooperate with one another over a communication network in order to achieve a specific objective or goal. In this architecture, information processing is not confined to a single machine; rather, it is distributed over several independent computers.
There are several technology frameworks that support distributed architectures, including .NET, .NET web services, AXIS Java web services, and Globus Grid services.
Middleware is an infrastructure that appropriately supports the development and execution of distributed applications. It provides a buffer between an application and the network.
Some advantages of Distributed Architecture:
1. Resource sharing
2. Concurrency
3. Scalability
4. Fault tolerance
Disadvantages of Distributed Architecture:
1. Complexity
2. Security
3. Manageability
Client-Server Architecture:
Client: This is the first process, which issues a request to a second process.
Server: This is the second process, which receives the request, carries it out and sends a reply to the client. In this architecture the server need not know about the clients, but a client must know the identity of the server.
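A minimal sketch of this request/reply pattern in Python (the port number, host address and messages are illustrative, not part of the notes; run server() in one process and client() in another):

```python
import socket

HOST, PORT = "127.0.0.1", 5000      # hypothetical address the client must already know

def server():
    """Second process: receives a request, carries it out, sends a reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind((HOST, PORT))
        srv.listen(1)
        conn, _addr = srv.accept()   # the server does not need to know clients in advance
        with conn:
            request = conn.recv(1024).decode()
            conn.sendall(f"reply to: {request}".encode())

def client():
    """First process: issues a request to the server whose identity it knows."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
        cli.connect((HOST, PORT))
        cli.sendall(b"GET status")
        print(cli.recv(1024).decode())
```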
VIRTUALIZATION CONCEPT:
Creating a virtual machine over an existing operating system and hardware is referred to as hardware virtualization.
A virtual machine provides an environment that is logically separated from the underlying hardware.
The machine on which the virtual machine is created is known as the HOST machine, and the virtual machine is referred to as the GUEST machine.
This virtual machine is managed by software or firmware, which is known as the Hypervisor.
There are two types of hypervisor: 1> TYPE 1 HYPERVISOR and 2> TYPE 2 HYPERVISOR
TYPE 1 HYPERVISOR:
It executes on the bare (physical) system. Oracle VM and VirtualLogix VLX are examples of Type 1 hypervisors.
TYPE 2 HYPERVISOR:
It is a software interface that emulates the devices with which a system normally interacts. Containers, KVM and Microsoft Hyper-V are examples of Type 2 hypervisors.
TYPES OF HARDWARE VIRTUALIZATION:
1> Full virtualization
2> Emulation virtualization
3> Para Virtualization
1> FULL VIRTUALIZATION :
In full virtualization, the underlying hardware is completely simulated. Guest software does not require any modification to run.
2> EMULATION VIRTUALIZATION
In emulation, the virtual machine simulates the hardware and hence becomes independent of it. In this case the guest OS does not require modification.
3> PARA VIRTUALIZATION
In para-virtualization, the hardware is not simulated; instead, the guest software runs in its own isolated domain.
**************************************
VIRTUAL CPU:
When we install a hypervisor, each physical CPU is abstracted into virtual CPUs (vCPUs). The hypervisor divides the available CPU cycles of each core and allows multiple VMs to time-share a given physical processor core. The hypervisor typically assigns one workload per virtual CPU. If the workload of a server needs more CPU cycles, it is better to deploy fewer VMs on that particular physical CPU. Consider the following example to understand the logic of virtual CPUs.
[Diagram: physical CPU cores abstracted into vCPUs]
We have a physical server with 2 processors (CPU 1 and CPU 2), each with 4 physical cores, so in total we have 2 * 4 = 8 physical cores. Based on some calculation, our hypervisor provides 5 to 10 virtual CPUs for each physical core. So in total we will have 8 physical cores * (5 to 10 vCPUs) = 40 to 80 vCPUs, which means we can allocate a maximum of 80 vCPUs to virtual machines.
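The same arithmetic as a quick sketch (the 5-to-10 vCPUs-per-core ratio is the hypervisor-dependent assumption used in the example above):

```python
def vcpu_capacity(physical_cpus, cores_per_cpu, vcpus_per_core):
    """Total vCPUs the hypervisor can expose for a given vCPU-per-core ratio."""
    physical_cores = physical_cpus * cores_per_cpu
    return physical_cores * vcpus_per_core

# Example from the notes: 2 CPUs x 4 cores = 8 physical cores
print(vcpu_capacity(2, 4, 5))    # 40 vCPUs at the conservative ratio
print(vcpu_capacity(2, 4, 10))   # 80 vCPUs at the aggressive ratio
```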
VIRTUAL MEMORY:
Virtual memory is, simply put, the virtual machine's share of the host's RAM. The memory resource setting of a virtual machine determines how much of the host memory is allocated to the virtual machine. The virtual hardware memory size determines how much memory is available to the applications that run in the virtual machine. You can add, change and configure virtual memory resources or options to enhance virtual machine performance. You can set most of the memory parameters while creating the virtual machine, or this can also be done after the guest OS is installed. Most hypervisors require the virtual machine to be powered off before changing these settings.
[Diagram: host physical memory divided between two virtual machines]
In the diagram, we can see the total physical memory divided between two virtual machines.
VIRTUAL STORAGE:
Storage virtualization is the pooling of physical storage from multiple network storage devices into what appears to be a single storage device that is managed from a central console. We cannot assign more storage to virtual machines than the data cluster physically offers. In the following example, we have a data cluster of 12 TB in total and four VMs to which we have allocated storage; in total, the maximum storage allocated to them is only 12 TB.
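A small sketch of that constraint, checking that allocations never exceed what the cluster physically offers (the 12 TB capacity and per-VM sizes are the example figures from above):

```python
CLUSTER_CAPACITY_TB = 12    # total physical capacity of the data cluster

def can_allocate(existing_allocations_tb, requested_tb):
    """Refuse any allocation that would exceed the physical cluster capacity."""
    return sum(existing_allocations_tb) + requested_tb <= CLUSTER_CAPACITY_TB

vms = [4, 3, 3]                  # storage already allocated to three VMs, in TB
print(can_allocate(vms, 2))      # True  -> total would be exactly 12 TB
print(can_allocate(vms, 3))      # False -> would exceed the 12 TB the cluster offers
```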
VIRTUAL NETWORKING:
We have VM 1, 2, 3 and 4 running on the same host, and they would like to send network traffic back and forth. This is done by virtual network interface cards, as shown in the following diagram, which connect virtually to a virtual switch created by the hypervisor. This virtual switch communicates with the physical NIC card of the server, which is connected to a physical switch and then communicates with the rest of the network equipment.
IMPORTANT QUESTIONS: 5 MARKS
1> WHAT ARE THE LAYERS AND TYPES OF CLOUD?
2> WRITE A NOTE ON THE DESIRED FEATURES OF A CLOUD.
3> STATE THE ESSENTIAL CHARACTERISTICS OF CLOUD COMPUTING.
4> EXPLAIN IN DETAIL ABOUT CLOUD DELIVERY MODEL.
5> DISCUSS THE OPERATIONAL AND ECONOMIC BENEFITS OF SAAS (SOFTWARE-AS-A-SERVICE).
6> WHAT ARE THE SECURITY CONSTRAINTS IN CLOUD COMPUTING?
7> WRITE NOTES ON GRID AND CLOUD.
8> HOW DO YOU IMPLEMENT THE HYBRID CLOUD.
9> DEFINE VIRTUAL CPU.
10> DEFINE VIRTUAL MEMORY
11> VIRTUAL STORAGE
12> VIRTUAL NETWORK
13> WRITE A SHORT NOTE ON SOFTWARE VIRTUALIZATION AND NETWORK VIRTUALIZATION.
14> WHAT IS MEANT BY MIGRATION IN CLOUD COMPUTING? WHY DO WE USE MIGRATION?
15> DISCUSS THE ADVANTAGES OF SOA (SERVICE-ORIENTED-ARCHITECTURE).
16> WRITE SHORT NOTES ON VIRTUALIZATION AND GOOGLE APP ENGINE.
17> WRITE SHORT NOTES ON HADOOP MAPREDUCE.
18> DEFINE HIGH PERFORMANCE COMPUTING (HPC).
19> DIFFERENCE BETWEEN GRID AND CLOUD COMPUTING.
20> DIFFERENCE BETWEEN DISTRIBUTED AND PARALLEL COMPUTING.
21> WRITE A SHORT NOTE ON THE ORIGIN OF CLOUD COMPUTING (HISTORY).
22> HOW CLOUD STORAGE IS DIFFERENT FROM ON PREMISE DATA CENTER?
Simplicity, scalability, maintenance, and accessibility of data are the features we expect from any public cloud storage; these are the main assets of .... and are very difficult to achieve in an on-premises data center.
SIMPLICITY -
We can easily create and set up storage objects in Google or Azure cloud.
SCALABILITY -
Storage capacity is highly scalable and elastic.
MAINTENANCE, ACCESSIBILITY & BACKUP -
Data in Azure or Google Storage is searchable and accessible through the latest web technologies like HTTP and REST APIs.
Multi-protocol (HTTP, TCP, etc.) data access for modern applications makes Azure or Google storage stand out in the cloud.
We are not required to bother about data center maintenance and data backup; everything is handled by the cloud provider's team.
The Azure or Google replication concept is used to maintain different copies of data at different geo-locations. Using this, we can protect our data even if a natural disaster occurs.
High availability and disaster recovery are among the good features provided by cloud storage which we cannot get in an on-premises data center.
[same IP address series, same domain, network address ]
23> DESCRIBE ABOUT CLUSTER COMPUTING?
=> Cluster computing: a cluster is a group of computers connected to each other that work together as a single computer. These computers are often linked through a LAN. Clusters came into existence because of the high need for them: computing requirements are increasing at a high rate and there is ever more data to process, so clusters have been used widely in d.... A cluster is a tightly coupled system, and one of its characteristics is a centralized job management and scheduling system. All the computers in the cluster use the same hardware and OS; the computers are in the same location and are connected with a very high-speed network so as to perform as a single computer. The resources of the cluster are managed by a centralized resource manager. A cluster is owned by a single organization or department. The nodes are interconnected in a high-end network with low latency and high bandwidth. Security in the cluster is login/password based, and it has a medium level of privacy depending on user privileges.
[Diagram: cluster computing architecture]
Architecture of Cluster Computing:
The architecture of cluster computing contains some main components: i) multiple standalone computers, ii) an operating system, iii) a high-performance interconnect, iv) communication software and v) an application platform.
Advantages:
Cluster software is automatically installed and configured, and the nodes of the cluster can be added and managed easily, so a cluster is very easy to deploy. It is an open system, very cost-effective to acquire and manage, and clusters have many sources of support and supply. It is fast and very flexible; the system is optimized for performance as well as simplicity, and the software configuration can be changed at any time. It also saves the time of searching the net for the latest drivers. The cluster system is very supportive as it includes software updates.
Disadvantage:
Cluster computing has some disadvantages, such as being hard to manage without experience. Also, when the size of the cluster is large (many nodes), it is difficult to find out that something has failed, and the programming environment is hard to improve when the software on some nodes differs from that on others.
*******************************************************************
WHAT IS DATA CENTER ?
A data center is essentially a building that provides space, power and cooling for network infrastructure. A data center design is based on a network of computing and storage resources that enables the delivery of shared applications and data. The key components of a data center design include routers, switches, firewalls, storage systems, servers and application delivery controllers.
WHY ARE DATA CENTER IMPORTANT TO BUSINESS ?
Data centers are an integral part of the enterprise, designed to support business applications and provide services such as:
1> Data storage, management, backup and recovery.
2> Productivity applications such as email services.
3> High-volume e-commerce transactions.
4> Powering online gaming communities.
5> Big data, machine learning and artificial intelligence based applications.
Today there are roughly 7 million data centers worldwide. Practically every business and government entity builds and maintains its own data center, has access to someone else's, or uses both models. Many options are available today, such as renting servers at a colocation facility, using data center services managed by a third party, or using public cloud-based services from providers like Amazon, Microsoft and Google.
The Core component of a data center :
The primary elements of a data center break down as follows:
1> Facility:
The usable space available for IT equipment. Providing round-the-clock access to information makes data centers some of the world's most energy-consuming facilities. They are designed to optimize space and environmental control to keep equipment within specific temperature and humidity ranges.
2> Core component:
Equipment and software for IT operations and the storage of data and applications. This may include storage systems, servers, and network infrastructure such as switches, routers, firewalls, load balancers etc.
3> Support Infrastructure:
Equipment contributing to securely sustaining the highest possible availability. The Uptime Institute has defined four tiers of data centers, with availability ranging from 99.671% to 99.995%.
Some components of the supporting infrastructure include:
a) UPS: battery banks, generators and redundant power sources.
b) Environmental control: computer room air conditioners (CRAC); heating, ventilation and air conditioning (HVAC) systems.
c) Physical security systems: biometrics and video surveillance systems.
d) Operations staff: personnel available to monitor operations and maintain IT and infrastructure equipment around the clock.
Q> WHAT ARE THE STANDARD FOR DATA -CENTER INFRASTRUCTURE?
The most widely adopted standard for data center design and infrastructure is ANSI/TIA-942. It includes standards for ANSI/TIA-942-ready certification, which ensures compliance with one of four categories of data center tiers rated for levels of redundancy and fault tolerance.
TIER -1 :
Basic site infrastructure. A Tier 1 data center offers limited protection against physical events. It has single-capacity components and a single, non-redundant distribution path.
TIER -2:
Redundant-capacity-component site infrastructure. This data center offers improved protection against physical events. It has redundant-capacity components and a single, non-redundant distribution path.
TIER -3:
Concurrently maintainable site infrastructure. This data center protects against virtually all physical events, providing redundant-capacity components and multiple independent distribution paths. Each component can be removed or replaced without disrupting services to end users.
TIER -4:
Fault-tolerant site infrastructure. This data center provides the highest levels of fault tolerance and redundancy. Redundant-capacity components and multiple independent distribution paths enable concurrent maintainability, and one fault anywhere in the installation will not cause downtime. (A worked downtime calculation is sketched below.)
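As a worked example of what these availability figures mean in practice, the sketch below converts each tier's availability into maximum downtime per year. The Tier 1 and Tier 4 percentages are the ones quoted above; the Tier 2 and Tier 3 values are assumptions, taken from the commonly cited Uptime Institute figures and not stated in these notes:

```python
HOURS_PER_YEAR = 365 * 24   # 8760 hours

tier_availability = {
    "Tier 1": 99.671,   # quoted above
    "Tier 2": 99.741,   # assumed (commonly cited Uptime Institute figure)
    "Tier 3": 99.982,   # assumed (commonly cited Uptime Institute figure)
    "Tier 4": 99.995,   # quoted above
}

for tier, availability in tier_availability.items():
    downtime_hours = HOURS_PER_YEAR * (1 - availability / 100)
    print(f"{tier}: {availability}% uptime -> about {downtime_hours:.1f} h downtime/year")
# Tier 1 ~28.8 h, Tier 2 ~22.7 h, Tier 3 ~1.6 h, Tier 4 ~0.4 h per year
```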
Q > TYPES OF DATA CENTER
Many types of data centers and service models are available. Their classification depends on whether they are owned by one or many organizations, what technologies they use for computing and storage, and even their energy efficiency. There are four main types of data center:
Hyper scale data center:
Co-located data center:
Wholesale Co-location data center:
Enterprise data center:
There is another one as well, i.e.
Cloud data center:
Security layers: communication (HTTPS security, firewalls), application (tier authentication, server load balancing, database SSL certificates), storage security (encryption algorithms).
*******************************************************
MARKS: 5
1> EXPLAIN HADOOP CORE COMPONENTS ?
2> STABILITY OF A TWO LEVEL RESOURCE ALLOCATION ARCHITECTURE.
3> EXPLAIN MODERN IMPLEMENTATION OF SAAS (SOFTWARE AS A SERVICE) USING SOA COMPONENTS.
4> DEFINE HYPERVISOR AND IT'S TYPE IN CLOUD COMPUTING WITH DIAGRAM.
5> LIST OUT THE TOP 10 OBSTACLES AND OPPORTUNITIES FOR ADOPTION AND GROWTH OF CLOUD COMPUTING.
6> EXPLAIN THE WORKING OF BROKERED CLOUD STORAGE ACCESS SYSTEM WITH DIAGRAM.
7> DESCRIBE THE USE OF THE EC2 (AMAZON ELASTIC COMPUTE CLOUD) SERVICE ON AWS OR THE AMAZON PLATFORM.
8> ANALYSE THE REASONS FOR INTRODUCING STORAGE AREA NETWORKS (SAN).
9> WHAT IS CLOUD-BASED STORAGE? EXPLAIN MANAGED AND UNMANAGED CLOUD STORAGE WITH EXAMPLES.
10> BRIEFLY DESCRIBE OPEN STACK WITH ITS APPLICATION.
MARKS :10
1> EXPLAIN BRIEFLY THE SECURITY CONCERN OF CLOUD COMPUTING.
2> DISCUSS THE OPERATIONAL AND ECONOMICAL BENEFITS OF SAAS.
3> EXPLAIN DEPLOYMENT MODEL OF CLOUD IN DETAIL.
4> EXPLAIN VIRTUAL MACHINE SECURITY IN CLOUD COMPUTING .
5> DESCRIBE IN DETAIL ABOUT PROVIDER DATA AND IT'S SECURITY.
6> DEFINE SERVICE ORIENTED ARCHITECTURE. EXPLAIN COMPONENT OF SERVICE ORIENTED ARCHITECTURE.
7> DISCUSS THE KEY PRINCIPLE OF SERVICE ORIENTED ARCHITECTURE.
8> DEFINE VIRTUALIZATION. WHAT IS THE NEED OF VIRTUALIZATION IN CLOUD COMPUTING.
9> WRITE SHORT NOTES ON AMAZON S3 (SIMPLE STORAGE SERVICE) AND AMAZON SIMPLEDB.
10> VIRTUALIZATION OF CPU , MEMORY AND INPUT OUTPUT DEVICE.
11> EXPLAIN THE GFS (GOOGLE FILE SYSTEM) CLUSTER ARCHITECTURE WITH SUITABLE BLOCK DIAGRAM.
12> EXPLAIN THE VIRTUALIZATION FOR DATA CENTER AUTOMATION.
13> EXPLAIN THE CHALLENGES AND LEGAL ISSUES OF CLOUD COMPUTING.
14> WHAT IS THE ROLE OF WEB SERVICES IN CLOUD COMPUTING?
15> WHAT IS A DATA CENTER.
16> DEFINE THE CORE COMPONENTS AND STANDARD OF A DATA CENTER.
17> EXPLAIN WITH A DIAGRAM THE HDFS (HADOOP DISTRIBUTED FILE SYSTEM) ARCHITECTURE.
18> EXPLAIN CLOUD INFRASTRUCTURE SECURITY AT APPLICATION LEVEL.
19> DISCUSS SCHEDULING ALGORITHM FOR CLOUD COMPUTING.
20> WRITE SHORT NOTES ON BROKER CLOUD STORAGE ACCESS AND STORAGE LOCATION AND TENANCY.
21> DESCRIBE ABOUT CLUSTER COMPUTING.
************************************************************************
12> EXPLAIN THE VIRTUALIZATION FOR DATA CENTER AUTOMATION.
The dynamic nature of cloud computing has pushed data center workload, server and even hardware automation to a whole new level. Now, any data center provider looking to get into cloud computing must look at some form of automation to help them be as agile as possible in the cloud world. New technologies are forcing data center providers to adopt new methods to increase efficiency, scalability and redundancy. Several big trends are driving increased use of data center facilities: more devices, more users, more cloud, more workloads and a lot more data.
The automation layers are given below,
a) Server layer:
Server and hardware automation have come a long way. Administrators only need to deploy one server profile and allow new servers to pick up those settings. More data centers are trying to get into the cloud business. This means deploying high-density, fast-provisioning servers and blades (a blade server is a high-end server). With the on-demand nature of the cloud, being able to quickly deploy fully configured servers is a big plus for staying agile and proactive.
b) Software layer:
Entire applications can be automated and provisioned based on usage and resource utilization. Using the latest load-balancing tools (i.e. traffic balancing, e.g. F5 load balancers), administrators are able to set thresholds for key applications running within the environment. If a load balancer, for example an F5 or a NetScaler (Citrix NetScaler), sees that a certain type of application is receiving too many connections, it can set off a process that allows the administrator to provision another instance of the application, or a new server which will host the app.
c) Virtual layer:
The modern data center is now full of virtualization and virtual machines. Using solutions like Citrix Presentation Server, administrators are able to take workload provisioning to a whole new level. Imagine being able to set up a process that will kick-start the creation of a new virtual server when an existing one starts to get over-utilized. Now administrators can create truly automated virtual machine environments where each workload is monitored, managed and controlled.
d) Cloud layer:
This is a new and still emerging field, yet some very large organizations are already deploying technologies like OpenStack, CloudStack and OpenNebula. Furthermore, they are tying these platforms in with big data management solutions like MapReduce and Hadoop. Organizations can deploy distributed data centers and have the entire cloud layer managed by a cloud control software platform. Engineers are able to monitor workloads, how data is being distributed, and the health of the cloud infrastructure (e.g. via gossip protocols, as used by the Cassandra database). The great part about this technology is that organizations can deploy a true private cloud with as much control and redundancy as a public cloud instance.
e) Data center layer:
Although full data center automation technologies are not quite here yet, we are seeing more robotics appear within the data center environment. Robotic arms already control massive tape libraries for Google, and robotic automation is a thoroughly discussed concept among other large data center providers. In working with modern data center technologies, administrators strive to be as efficient as possible; this means deploying new types of automation solutions which span the entire technology stack.
**********************************************
5> DISCUSS THE OPERATIONAL AND ECONOMIC BENEFITS OF SAAS (SOFTWARE-AS-A-SERVICE).
OPERATIONAL BENEFITS OF SAAS :
a) Managing business-driven IT projects -
The SaaS model provides the necessary infrastructure and thus leads to technology projects that address true business needs.
b) Increasing consumer demand -
The SaaS model provides the reliability to deliver near-perfect (99.99%) system availability, so any number of users can access the system at any time from anywhere.
c) Addressing growth -
The SaaS model provides scalability, so an increasing number of consumers can easily be supported as they work to meet their own objectives.
d) Servicing new markets quickly and easily -
SaaS allows the organization to quickly and easily add programs to accommodate changes in demand at a faster rate.
e) On Demand -
The solution is self-served and available for use as needed.
f) Scalable -
It allows for virtually infinite scalability and quick processing times.
ECONOMIC BENEFITS OF SAAS :
SaaS not only saves time but also has greater financial benefits.
a) It reduces IT expenses.
b) The implementation cost of SaaS is much lower than that of traditional software.
c) It redirects saved expenses towards business improvement: by utilizing SaaS we are free to use as much of any software as we need, which gives easy and economical access to many programs.
d) SaaS vendors release upgrades for their software, so users need not put any effort into installing and upgrading the software.
e) Another main benefit of SaaS is that it can quickly and easily be accessed from anywhere using a modern web browser.
capex and opex
*************************************************
1. Differences between Public, Private, Hybrid and Community cloud.
| Feature | Public | Private | Hybrid | Community |
|---|---|---|---|---|
| Host | Service provider | Enterprise | Enterprise | Community or third party |
| Suitable for | Large enterprise | Large enterprise | Small / medium enterprise | Financial, health, legal companies |
| Access | Internet | Intranet, VPN | Intranet, VPN | Intranet, VPN |
| Security | Low | Most secure | Moderate | Secured |
| Cost | Cheapest | High cost | Cost-effective | Cost-effective |
| Owner | Service provider | Enterprise | Enterprise | Community |
| Users | Organizations and the general public | Business organization | Business organization | Community members |
| Reliability | Moderate | Very high | Medium to high | Very high |
| Scalability | Very high | Limited | Very high | Limited |
2. Service Model
Software-as-a-Service: there is no CAPEX or OPEX concept on the client side.
Infrastructure-as-a-Service: only operational costs (OPEX) are needed from the client side.
Platform-as-a-Service: only operational costs (OPEX) are needed from the client side.
Packaged software: both capital investments (CAPEX) and operational costs are involved on the client side.
With CAPEX, there is high risk when migration is done, because the server and storage costs are borne by the company: the cost of servers, storage and other devices is paid by the company, while cooling, electricity etc. are provided by the cloud provider.
With OPEX, there is no need to wait at migration time; the company can change its cloud provider at any time.
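A small illustrative comparison of the two cost models over a fixed horizon (all figures are made-up placeholders, not taken from the notes):

```python
def capex_total(upfront_hardware, yearly_running_cost, years):
    """Packaged / on-premises model: large upfront investment plus running costs."""
    return upfront_hardware + yearly_running_cost * years

def opex_total(monthly_subscription, years):
    """Cloud model: no upfront investment, pay as you go."""
    return monthly_subscription * 12 * years

# Hypothetical 3-year comparison
print(capex_total(upfront_hardware=50_000, yearly_running_cost=5_000, years=3))  # 65000
print(opex_total(monthly_subscription=1_500, years=3))                           # 54000
```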
CLOUD MANAGEMENT TASK
Cloud management means the management of cloud data using tools, i.e. protocols, and monitoring all services every second.
AUDIT SYSTEM BACKUPS:
IT Audit -
Information Technology Audit
Income Tax Audit
For example, audit company names: PricewaterhouseCoopers (PwC), Ernst & Young. These companies are used for audit-checking purposes.
Performance is judged based on the share market; if a company has the highest share value, it can be considered a reputed one.
Backup: the audit company looks for backups in a random order; there must be both kinds, i.e. online backups and offline backups. In a public cloud, backup can be done anywhere. In a private cloud, where and in which order backups are done is all specified in the SLA.
DATA FLOW OF THE SYSTEM :
Servers use four-port HBA (host bus adapter) cards so that the system can remain in service all the time.
BEWARE OF VENDOR LOCK-IN :
MIGRATION: time, bandwidth, TB etc. rules and regulations; how to hand over the data to the client; e.g. switching data from Google to Amazon. The company pays for that migration.
KNOWING PROVIDER SECURITY PROCEDURE:
MONITOR CAPACITY PLANNING :
MONITOR AUDIT LOG :
EVERY LOG ABOUT MACHINES
SOLUTION TESTING AND VALIDATION:
For the upcoming client list: a prototype model is already built, and if a client requires it, the company provides it for the future.
...(note)
ADMINISTRATING FEATURES OF CLOUD
Resource administration:
* Resource Configuration
* Security Enforcement
* Operations monitoring
* Provisioning of resources
* Management of policies
* Performance maintenance
* Performance optimizing
Based on some protocols, scheduling algorithms are run in the cloud, such as First Come First Serve, Round Robin, the Min-Min algorithm (the shortest tasks are scheduled first) and the Max-Min algorithm (the highest workload is given first). Only one protocol runs on a given load balancer.
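A minimal sketch of the Min-Min idea named above: repeatedly pick the task with the smallest completion time and give it to the VM that can finish it earliest. The task lengths and number of VMs are hypothetical, and this is only one simplified reading of the algorithm:

```python
def min_min_schedule(task_lengths, num_vms):
    """Min-Min: always schedule the task with the minimum completion time next."""
    vm_ready = [0.0] * num_vms                  # time at which each VM becomes free
    assignment = {}                             # task id -> chosen VM
    remaining = dict(enumerate(task_lengths))   # task id -> task length
    while remaining:
        # completion time of every remaining task on every VM; take the overall minimum
        task, vm, finish = min(
            ((t, v, vm_ready[v] + length)
             for t, length in remaining.items()
             for v in range(num_vms)),
            key=lambda x: x[2],
        )
        assignment[task] = vm
        vm_ready[vm] = finish
        del remaining[task]
    return assignment, vm_ready

assignment, loads = min_min_schedule([4, 2, 7, 1], num_vms=2)
print(assignment)   # {3: 0, 1: 1, 0: 0, 2: 1} -> shortest tasks placed first
print(loads)        # [5.0, 9.0] -> finishing time of each VM
```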
[Diagram: 3-tier load-balancing architecture]
This is a 3-tier architecture, i.e. data center, VMs and database, where the user first arrives at the data center.
There are server load balancers and link load balancers.
Users enter the VMs. Which user enters which VM is decided by the server load balancer: there are some protocols, and then, using a scheduling algorithm, the user is placed in a VM. The user authenticates and enters the VM, and the data is stored in the DB; by authenticating, the user can enter the database. VM1, VM2 and VM3 run the same application, and that is where users enter at the VMs.
There is no single shortest path to reach the server, because there are thousands of users entering; this is handled by BGP (Border Gateway Protocol).
** Cloud Router uses Border Gateway Protocol (BGP) to exchange routes between your Virtual Private Cloud (VPC) network and your on-premises network. On Cloud Router, you configure an interface and a BGP peer for your on-premises router. The interface and BGP peer configuration together form a BGP session.
HOP :
1. INSTANCE STORAGE
2. VOLUME STORAGE
3. OBJECT STORAGE
INSTANCE STORAGE OR VIRTUAL DISK IN THE CLOUD
In a traditional virtualized environment, the virtual disk storage model is the predominant one. Basically, this storage is used like conventional virtual disk storage. It can be implemented in numerous ways; for example, DAS (direct-attached storage) is generally used to implement instance storage.
VOLUME STORAGE OR SAN (STORAGE AREA NETWORK)
Volume storage is also known as block storage. It supports operations like read and write and keeps the system files of running virtual machines.
As suggested by its name, data is stored in structured blocks and volumes, where files are split into equal-sized blocks. Each block has its own address.
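A tiny sketch of the "split into equal-sized blocks, each with its own address" idea (the block size and file content are illustrative; real block storage uses sizes like 512 B or 4 KiB):

```python
BLOCK_SIZE = 4   # bytes, illustrative only

def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Map block address -> chunk of the file (the last block may be shorter)."""
    return {addr: data[i:i + block_size]
            for addr, i in enumerate(range(0, len(data), block_size))}

blocks = split_into_blocks(b"hello block storage")
for addr, chunk in blocks.items():
    print(addr, chunk)    # each block is reachable through its own address
```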
Input/Output Operations Per Second (IOPS):
When backend IOPS (home usage, data download) is greater than frontend IOPS (server storage, data upload), data fetches can be done easily. We check IOPS to assess storage functionality; the storage administrator handles this.
OBJECT STORAGE OR NAS (Network attached storage)
Cloud-native applications need space for storing data that is shared between different VMs. Often, however, there is a need for space that can extend across various data centers in multiple geographies, and this is what object storage provides. For example, Amazon Simple Storage Service (S3) presents a single storage space across an entire region. Object storage stores data as objects, unlike other approaches which use a file hierarchy. Each object consists of data, metadata and a unique identifier. Object storage also holds a substantial amount of unstructured data; this kind of storage is used for storing songs in audio applications, photos on social media, or files in online services like Dropbox.
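A minimal in-memory sketch of the "object = data + metadata + unique identifier" model described above. This is not the real S3 API; the class, keys and values are hypothetical:

```python
import uuid

class ObjectStore:
    """Flat (non-hierarchical) store: each object keeps data, metadata and a unique id."""
    def __init__(self):
        self._objects = {}

    def put(self, data: bytes, metadata: dict) -> str:
        object_id = str(uuid.uuid4())            # the unique identifier
        self._objects[object_id] = {"data": data, "metadata": metadata}
        return object_id

    def get(self, object_id: str) -> dict:
        return self._objects[object_id]

store = ObjectStore()
oid = store.put(b"...binary photo bytes...", {"content-type": "image/png", "owner": "demo-user"})
print(oid, store.get(oid)["metadata"])
```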
How do you implement the hybrid cloud?
A hybrid cloud is a composition of two or more distinct cloud deployment models (private, community, or public) that remain unique entities, but are bound together by standardized or proprietary technology that enables data and application portability (e.g., cloud bursting for load balancing between clouds).
A large portion of the agencies that have already switched some processes over to cloud-based computing solutions have utilized hybrid cloud options. Few enterprises have the ability to switch over all of their IT services at one time; the hybrid option allows for a mix of on-premises and cloud options, which provides an easier transition. NASA is one example of a federal agency that is utilizing the hybrid cloud computing deployment model. Its Nebula open-source cloud computing project uses a private cloud for research and development, as well as a public cloud to share datasets with external partners and the public. The hybrid cloud computing deployment model has also proven to be the option of choice for state and local governments, with states like Michigan and Colorado having already declared their cloud computing intentions with plans illustrating hybrid cloud deployment models.
Write short notes on (i) Software virtualization (ii) network virtualization.
One of the most widely used software virtualization products is Software Virtualization Solution (SVS), developed by Altiris. It is similar to hardware being simulated as virtual machines. Software virtualization involves creating a virtual layer or virtual hard drive space where applications can be installed. From this virtual space, applications can be run as if they had been installed onto the host OS. Once users have finished using an application, they can switch it off. When an application is switched off, any changes that the application made to the host OS are completely reversed. This means that the registry entries and installation directories show no trace of the application ever having been installed or executed. The benefits of software virtualization are: the ability to run applications without making permanent registry or library changes; the ability to run multiple versions of the same application; and the ability to install applications that would otherwise conflict with each other.
(ii) Network virtualization
Network virtualization is the process of combining hardware and software network resources and network functionality into a single, software-based administrative entity, which is called a virtual network. Network virtualization involves platform virtualization. Network virtualization is categorized into external network virtualization and internal network virtualization.
External network virtualization combines many networks into one virtual unit. Internal network virtualization provides network-like functionality to the software containers on a single system. Network virtualization enables connections between applications, services, dependencies and end users to be accurately emulated in the test environment.
1. Fault Tolerance
2. Throughput
:::: HADOOP ::::::
::::: WRITE AND READ OPERATION IN HDFS :::::
WRITE OPERATION in HDFS :
1 & 2. CREATE: A client initiates the write operation by calling the create() method of the DistributedFileSystem object, which creates a new file.
The DistributedFileSystem object connects to the NameNode using an RPC call and initiates the creation of the new file.
3. WRITE: Once the new record is created in the NameNode, an object of type FSDataOutputStream is returned to the client. The client uses it to write data into HDFS.
4. DATA QUEUE: FSDataOutputStream contains a DFSOutputStream object, which looks after communication with the DataNodes and the NameNode.
5. DATA STREAMER: There is one more component, called the DataStreamer, which consumes this data queue. The DataStreamer also asks the NameNode for the allocation of new blocks, thereby picking desirable DataNodes to be used for replication.
6. DATANODE PIPELINE: Now the process of replication starts by creating a pipeline of DataNodes. In our case, we have chosen a replication level of 3, and hence there are three DataNodes in the pipeline.
7. The DataStreamer pours packets into the first DataNode in the pipeline.
8. Every DataNode in the pipeline stores each packet it receives and forwards the same to the next DataNode in the pipeline.
9. Another queue, the ack queue, is maintained by DFSOutputStream to store packets which are waiting for acknowledgement from the DataNodes.
10. Sending acknowledgement packets: Once acknowledgements for a packet have been received from all DataNodes in the pipeline, it is removed from the ack queue (step 9). In the event of any DataNode failure, packets from this queue are used to reinitiate the operation.
11. After the client is done writing data, it calls the close() method, which results in flushing the remaining data packets to the pipeline, followed by waiting for acknowledgements.
12. Once the final acknowledgement is received, the NameNode is contacted to tell it that the file write operation is complete. (A toy simulation of this pipeline is sketched below.)
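A toy simulation of the pipeline described in steps 3-12: packets move from a data queue into a pipeline of three DataNodes and wait in an ack queue until every replica has acknowledged them. This is a simplified model of the mechanism, not the real Hadoop API:

```python
from collections import deque

REPLICATION = 3                        # pipeline of three DataNodes, as in the notes

def write_file(packets):
    data_queue = deque(packets)        # filled via FSDataOutputStream (steps 3-4)
    ack_queue = deque()                # packets awaiting acknowledgement (step 9)
    datanodes = [[] for _ in range(REPLICATION)]

    while data_queue:
        packet = data_queue.popleft()  # the DataStreamer consumes the data queue (step 5)
        ack_queue.append(packet)
        for node in datanodes:         # each DataNode stores and forwards the packet (steps 7-8)
            node.append(packet)
        ack_queue.popleft()            # all replicas acknowledged -> drop from ack queue (step 10)

    return datanodes                   # close() would now inform the NameNode (steps 11-12)

replicas = write_file([b"pkt-1", b"pkt-2", b"pkt-3"])
print([len(node) for node in replicas])   # every DataNode in the pipeline holds all 3 packets
```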
READ OPERATION in HDFS:
............ GPFS ................
1. File system nodes: coordinate administrative tasks.
2. Manager nodes: one exists per file system. The different manager nodes are the global lock manager, the local lock manager, the allocation manager, etc.
3. Storage nodes: implement shared access to files, coordinate with the manager nodes during recovery, and allow both file data and metadata to be striped across multiple storage nodes.
Some other auxiliary nodes include:
1. Meta node - a node dynamically elected as the meta node for centralized management of a file's metadata. The election of the meta node is facilitated by the token server.
2. Token server - a token server tracks all tokens granted to all nodes in the cluster.
A token-granting algorithm is used to reduce the cost of token management.
What is GPFS used for...
IBM GPFS (General Parallel File System) is a file system used to distribute and manage data across multiple servers, and it is implemented in many high-performance computing and large-scale storage environments. GPFS is among the leading file systems for high-performance computing applications.
How does a parallel file system work?
A parallel file system breaks up a data set and distributes, or stripes, the blocks to multiple storage drives, which can be located in local or remote servers. Users do not need to know the physical location of the data blocks to retrieve a file; the system uses a global namespace to facilitate data access.
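A small sketch of that striping idea: blocks of one file are spread round-robin across several storage nodes, while a global namespace maps the file name back to its block locations. The node names and block size are illustrative, not GPFS internals:

```python
BLOCK_SIZE = 8                              # bytes, illustrative only
NODES = ["node-a", "node-b", "node-c"]      # hypothetical storage servers

namespace = {}                              # global namespace: file -> list of (node, block)

def stripe(filename: str, data: bytes):
    """Split the file into blocks and distribute them round-robin over the nodes."""
    placement = []
    for i in range(0, len(data), BLOCK_SIZE):
        node = NODES[(i // BLOCK_SIZE) % len(NODES)]
        placement.append((node, data[i:i + BLOCK_SIZE]))
    namespace[filename] = placement

def read(filename: str) -> bytes:
    """The reader never needs to know where the blocks physically live."""
    return b"".join(block for _node, block in namespace[filename])

stripe("results.dat", b"striped across several storage nodes")
print(namespace["results.dat"][0][0])   # node holding the first block
print(read("results.dat"))
```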
What is the GPFS file system on Linux?
GPFS is a high-performance clustered file system developed by IBM. It allows you to configure a highly available file system with concurrent access from a cluster of nodes. Cluster nodes can be servers running the AIX or Linux operating systems.
How does the General Parallel File System (GPFS) work in cluster systems?
GPFS is a cluster file system; this means that it provides concurrent access to a single file system, or a set of file systems, from multiple nodes. Multiple GPFS clusters can share data within a location or across wide area network connections. The GPFS disk data structures support file systems with up to 4096 disks of up to 1 TB in size each, for a total of 4 petabytes per file system.