Wednesday 28 April 2021

AWS Migration & Transfer

AWS Migration Hub

AWS Migration Hub provides a single location to track the progress of application migrations across multiple AWS and partner solutions.

AWS Migration Hub allows you either to import information about on-premises servers and applications, or to perform a deeper discovery using the AWS Discovery Agent or the AWS Discovery Collector, an agentless approach for VMware environments.

AWS Migration Hub network visualization allows you to accelerate migration planning by quickly identifying servers and their dependencies, identifying the role of a server, and grouping servers into applications.

To use network visualization, first install AWS Discovery agents and start data collection from the Data Collectors page.

AWS Migration Hub provides all application details in a central location.

This allows you to track the status of all the moving parts across all migrations, making it easier to view overall migration progress and reducing the time spent determining current status and next steps.

AWS Migration Hub lets you track the status of your migrations into any AWS region supported by your migration tools.

Regardless of which regions you migrate into, the migration status will appear in Migration Hub when you use an integrated tool.

Application Discovery Service

AWS Application Discovery Service helps enterprise customers plan migration projects by gathering information about their on-premises data centers.

Planning data center migrations can involve thousands of workloads that are often deeply interdependent.

Server utilization data and dependency mapping are important early steps in the migration process.

AWS Application Discovery Service collects and presents configuration, usage, and behavior data from your servers to help you better understand your workloads.

The collected data is retained in encrypted format in an AWS Application Discovery Service data store.

You can export this data as a CSV file and use it to estimate the Total Cost of Ownership (TCO) of running on AWS and to plan your migration to AWS.

This data is also available in AWS Migration Hub, where you can migrate the discovered servers and track their progress as they are migrated to AWS.

Database Migration Service

AWS Database Migration Service helps you migrate databases to AWS quickly and securely.

The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database. The AWS Database Migration Service can migrate your data to and from most widely used commercial and open-source databases.

AWS Database Migration Service supports homogeneous migrations such as Oracle to Oracle, as well as heterogeneous migrations between different database platforms, such as Oracle or Microsoft SQL Server to Amazon Aurora.

With AWS Database Migration Service, you can continuously replicate your data with high availability and consolidate databases into a petabyte-scale data warehouse by streaming data to Amazon Redshift and Amazon S3.

Server Migration Service

AWS Server Migration Service (SMS) is an agentless service which makes it easier and faster for you to migrate thousands of on-premises workloads to AWS.

AWS SMS allows you to automate, schedule, and track incremental replications of live server volumes, making it easier for you to coordinate large-scale server migrations.

AWS Transfer Family

The AWS Transfer Family provides fully managed support for file transfers directly into and out of Amazon S3 or Amazon EFS.

The AWS Transfer Family supports Secure File Transfer Protocol (SFTP), File Transfer Protocol over SSL (FTPS), and File Transfer Protocol (FTP). It helps you migrate your file transfer workflows to AWS by integrating with existing authentication systems and providing DNS routing with Amazon Route 53, so nothing changes for your customers and partners, or their applications.

With your data in Amazon S3 or Amazon EFS, you can use it with AWS services for processing, analytics, machine learning, and archiving, as well as for home directories and developer tools.

AWS Snow Family

AWS provides edge infrastructure and software that moves data processing and analysis as close as necessary to where data is created in order to deliver intelligent, real-time responsiveness and streamline the amount of data transferred.

This includes deploying AWS managed hardware and software to locations outside AWS Regions and even beyond AWS Outposts.

The AWS Snow Family helps customers that need to run operations in austere, non-data center environments, and in locations where there is a lack of consistent network connectivity.

The Snow Family, comprising AWS Snowcone, AWS Snowball, and AWS Snowmobile, offers a number of physical devices and capacity points, most with built-in computing capabilities.

These services help physically transport up to exabytes of data into and out of AWS.

Snow Family devices are owned and managed by AWS and integrate with AWS security, monitoring, storage management, and computing capabilities.

You can improve the transfer speed from your data source to a Snowball device in the following ways, ordered from largest to smallest positive impact on performance:

  1. Use the latest Mac or Linux Snowball client
  2. Batch small files together
  3. Perform multiple copy operations at one time
  4. Copy from multiple workstations
  5. Transfer directories, not files
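As an illustration of the second tip, the sketch below packs many small files into a single uncompressed tar archive before copying, so the transfer handles one large object instead of many tiny ones. It is a minimal example using only the Python standard library; the function names and the demo file names are made up for illustration, and it does not call the Snowball client itself.

```python
import tarfile
import tempfile
from pathlib import Path


def batch_small_files(source_dir, archive_path):
    """Pack every regular file under source_dir into one uncompressed tar archive.

    Returns the number of files added.
    """
    count = 0
    # Mode "w" writes a plain tar without compression, which keeps CPU cost low.
    with tarfile.open(archive_path, "w") as archive:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to source_dir so the archive is relocatable.
                archive.add(path, arcname=path.relative_to(source_dir).as_posix())
                count += 1
    return count


def demo():
    """Create a few small files in a temp directory and batch them."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "source"
        source.mkdir()
        for i in range(5):
            (source / f"record_{i}.txt").write_text(f"payload {i}\n")
        # The archive lives outside source_dir so it is not packed into itself.
        archive_path = Path(tmp) / "batch.tar"
        added = batch_small_files(source, archive_path)
        with tarfile.open(archive_path) as archive:
            names = sorted(archive.getnames())
    return added, names
```

Calling `demo()` returns the number of files packed and the names stored in the archive. In practice you would point `batch_small_files` at the real data directory and copy the resulting archive to the device.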

AWS DataSync

AWS DataSync is an online data transfer service that simplifies, automates, and accelerates moving data between on-premises storage systems and AWS Storage services, as well as between AWS Storage services.

You can use DataSync to migrate active datasets to AWS, archive data to free up on-premises storage capacity, replicate data to AWS for business continuity, or transfer data to the cloud for analysis and processing.

DataSync provides built-in security capabilities such as encryption of data in-transit, and data integrity verification in-transit and at-rest.

It optimizes use of network bandwidth, and automatically recovers from network connectivity failures.

In addition, DataSync provides control and monitoring capabilities such as data transfer scheduling and granular visibility into the transfer process through Amazon CloudWatch metrics, logs, and events.

DataSync can copy data between Network File System (NFS) shares, Server Message Block (SMB) shares, self-managed object storage, AWS Snowcone, Amazon Simple Storage Service (Amazon S3) buckets, Amazon Elastic File System (Amazon EFS) file systems, and Amazon FSx for Windows File Server file systems.

AWS Lambda

Concurrency

Your functions’ concurrency is the number of instances that serve requests at a given time.

For an initial burst of traffic, your functions’ cumulative concurrency in a Region can reach an initial level of between 500 and 3000, which varies per Region.

Burst concurrency quotas:

  • 3000 – US West (Oregon), US East (N. Virginia), Europe (Ireland)
  • 1000 – Asia Pacific (Tokyo), Europe (Frankfurt), US East (Ohio)
  • 500 – Other Regions

When requests come in faster than your function can scale, or when your function is at maximum concurrency, additional requests fail with a throttling error (429 status code).

Throttling can surface as the errors “Rate exceeded” and 429 “TooManyRequestsException”.

If you see these errors, check whether there are throttling messages in Amazon CloudWatch Logs but no corresponding data points in the Lambda Throttles metric.

If the Lambda Throttles metric has no data points, the throttling is happening on API calls made from your Lambda function code rather than on the function itself.

Methods to resolve throttling include:

  • Configure reserved concurrency.
  • Use exponential backoff in your application code.
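The second method can be sketched as follows. This is a minimal, generic example of exponential backoff with full jitter, not code from any AWS SDK: `ThrottledError` is a hypothetical stand-in for a 429 response, and the base delay, cap, and retry count are assumed values you would tune for your workload.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a 429 / TooManyRequestsException response."""


def backoff_delays(max_retries, base=0.1, cap=5.0, rng=random.random):
    """Return one delay per retry, each random in [0, min(cap, base * 2**n))."""
    return [rng() * min(cap, base * (2 ** attempt)) for attempt in range(max_retries)]


def call_with_backoff(operation, max_retries=5, base=0.1, cap=5.0, sleep=time.sleep):
    """Call operation(), retrying on ThrottledError with exponential backoff."""
    for delay in backoff_delays(max_retries, base, cap):
        try:
            return operation()
        except ThrottledError:
            # Wait longer (on average) after each consecutive throttle.
            sleep(delay)
    # Final attempt: let any ThrottledError propagate to the caller.
    return operation()
```

The random jitter spreads retries from many clients over time, so they are less likely to hit the service again in the same instant. The `sleep` parameter is injectable only so the function can be exercised without real waiting.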

Concurrency metrics:

  • ConcurrentExecutions
  • UnreservedConcurrentExecutions
  • ProvisionedConcurrentExecutions
  • ProvisionedConcurrencyInvocations
  • ProvisionedConcurrencySpilloverInvocations
  • ProvisionedConcurrencyUtilization

Invocations

Synchronous:

  • CLI, SDK, API Gateway.
  • Result returned immediately.
  • Error handling happens client side (retries, exponential backoff etc.).

Asynchronous:

  • S3, SNS, CloudWatch Events etc.
  • Lambda retries on function errors up to 2 more times (3 attempts in total).
  • Processing must be idempotent (due to retries).
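One common way to make processing idempotent is to remember which events have already been handled and return the stored result on a repeat delivery. The sketch below shows the idea with an in-memory dictionary; the `id` field and the class name are hypothetical, and a real function would need a durable store because Lambda execution environments are neither shared nor permanent.

```python
class IdempotentHandler:
    """Skip events whose ID has already been processed successfully."""

    def __init__(self, process):
        self._process = process
        # In-memory stand-in for a durable store (assumption for illustration).
        self._results = {}

    def handle(self, event):
        event_id = event["id"]  # Hypothetical field carrying a unique event ID.
        if event_id in self._results:
            # Repeat delivery: return the earlier result without side effects.
            return self._results[event_id]
        result = self._process(event)
        # Record the result only after processing succeeds, so failures are retried.
        self._results[event_id] = result
        return result
```

With this wrapper, a retry of an event that already succeeded does not repeat the side effect, while a retry of an event that failed is processed again.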

Event source mapping:

  • SQS, Kinesis Data Streams, DynamoDB Streams.
  • Lambda does the polling (polls the source).
  • Records are processed in order (except for SQS standard).

Traffic Shifting

With the introduction of alias traffic shifting, it is now possible to trivially implement canary deployments of Lambda functions. By updating additional version weights on an alias, invocation traffic is routed to the new function versions based on the weight specified.

Detailed CloudWatch metrics for the alias and version can be analyzed during the deployment, or other health checks performed, to ensure that the new version is healthy before proceeding.

The following example AWS CLI command updates an alias so that 5% of invocations are routed to version 2, while the version the alias already points to keeps the remaining 95% of traffic:

aws lambda update-alias --function-name myfunction --name myalias --routing-config '{"AdditionalVersionWeights" : {"2" : 0.05} }'
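A gradual rollout repeats this command with increasing weights. The sketch below only builds the command strings for such a sequence; it does not run them or call AWS. The helper names and the 5% / 25% / 50% schedule are assumptions for illustration, and the flags are the same ones used in the command above.

```python
import json
import shlex


def routing_config(new_version, weight):
    """Return the JSON for --routing-config sending `weight` of traffic to new_version."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return json.dumps({"AdditionalVersionWeights": {new_version: weight}})


def update_alias_command(function_name, alias, new_version, weight):
    """Build the update-alias command line as a shell-quoted string."""
    args = [
        "aws", "lambda", "update-alias",
        "--function-name", function_name,
        "--name", alias,
        "--routing-config", routing_config(new_version, weight),
    ]
    return " ".join(shlex.quote(arg) for arg in args)


def canary_steps(function_name, alias, new_version, weights=(0.05, 0.25, 0.5)):
    """Return one command per rollout step, in the order given by weights."""
    return [update_alias_command(function_name, alias, new_version, w) for w in weights]
```

For example, `canary_steps("myfunction", "myalias", "2")` returns three commands. Between steps you would check the CloudWatch metrics described above before moving on. Completing the rollout, by pointing the alias at the new version and clearing the weights, is not shown.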

AWS Batch

Batch jobs run as Docker containers.

Dynamically provisions EC2 instances in a VPC.

Deployment Options:

  • Managed for you entirely (serverless).
  • Manage yourself.

For managed deployments:

  • Choose your pricing model: On-demand or Spot.
  • Choose instance types.
  • Configure VPC/subnets.

Pay for underlying EC2 instances.

Schedule using CloudWatch Events.

Orchestrate with Step Functions.

Can use on-demand or Spot instances.

Multi-node parallel jobs can be used for HPC use cases.

Comparison with Lambda:

  • No execution time limit (Lambda is 15 minutes)
  • Any runtime (Lambda has limited runtimes)
  • Uses EBS for storage (Lambda has limited scratch space; can use EFS if in VPC)
  • Batch on EC2 is not serverless
  • Can use Fargate with Batch for serverless architecture
  • Lambda is serverless + you pay only for execution time
  • Can be more expensive.

Amazon EC2

Placement Groups

Cluster Placement Groups:

  • A cluster placement group is a logical grouping of instances within a single Availability Zone.
  • A cluster placement group can span peered VPCs in the same Region.
  • Instances in the same cluster placement group enjoy a higher per-flow throughput limit for TCP/IP traffic and are placed in the same high-bisection bandwidth segment of the network.
  • Cluster placement groups are recommended for applications that benefit from low network latency, high network throughput, or both.
  • They are also recommended when the majority of the network traffic is between the instances in the group.

Network Adapters

An Elastic Fabric Adapter (EFA) is a network device that you can attach to your Amazon EC2 instance to accelerate High Performance Computing (HPC) and machine learning applications.

EFA enables you to achieve the application performance of an on-premises HPC cluster, with the scalability, flexibility, and elasticity provided by the AWS Cloud.

AWS Elastic Beanstalk

With AWS Elastic Beanstalk you can perform a blue/green deployment, where you deploy the new version to a separate environment, and then swap CNAMEs of the two environments to redirect traffic to the new version instantly.