What is the AWS Data Egress Waiver? 

AWS offers a Data Egress Waiver that mostly eliminates Data Transfer charges for Cornell. See the AWS blog post about it for more details.

...

How much are Data Transfer Charges for Cornell? Are we close to the 15% cap for the AWS Data Egress Waiver?

In short, no, Cornell is not near the 15% cap of the Data Egress Waiver.

...

As of March 2023, the last three months averaged 6.83% utilization and the last six averaged 6.16%. If more detailed information is needed, please contact cloud-support@cornell.edu.

Billing

I got an invoice for AWS from CloudCheckr. What should I do with it?

...

As of June 2022, individual Cornell AWS accounts cannot buy Reserved Instances or Savings Plans. Cornell has a program that purchases those centrally. For more information, please contact the Cloud Team.

Licensing

Does the Cornell Microsoft Agreement cover Microsoft software in AWS?

...

What is the Cornell Standard VPC?

See The Cornell “Standard” AWS VPC.

Why can't I connect to my EC2 instance?

...

$ traceroute -T -p 389 ad10.cornell.edu
traceroute to ad10.cornell.edu (10.92.36.80), 30 hops max, 60 byte packets
 1  ip-10-92-36-80.ec2.internal (10.92.36.80)  7.740 ms  7.711 ms  9.136 ms

...

Do I need multiple NAT Gateways?

VPCs created by the Cloud Team for Cornell AWS customers generally contain only a single NAT Gateway. This NAT Gateway provides access to the public internet for private subnets in the VPC. All private subnets in the VPC are configured to use the same NAT Gateway, regardless of the Availability Zone of the private subnet. This means that the NAT Gateway is a single point of failure because the resources in your private subnets may not be able to reach the internet if the AZ where the NAT Gateway resides experiences network issues.

If you require high availability and resiliency for the deployments in your private subnets, you may want to consider adding NAT Gateways to your VPC, so that there is one NAT Gateway in each Availability Zone where your private subnets reside.

The downside of multiple NAT Gateways is that each one costs about $1/day to run, and some Cornell AWS customers do not consider the high availability worth that cost. 
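As a rough sketch of that trade-off, the shell function below multiplies out the approximate $1/day figure quoted above. The function name and the 30-day month are assumptions made for illustration, and per-GB data processing charges are not included.

```shell
# Rough monthly cost, in whole dollars, of running COUNT NAT Gateways.
# Assumes the approximate $1/day figure quoted above and a 30-day month;
# per-GB data processing charges are not included.
nat_monthly_cost() {
  count=$1
  dollars_per_day=1
  days=30
  echo $(( count * dollars_per_day * days ))
}

nat_monthly_cost 1   # single shared NAT Gateway
nat_monthly_cost 3   # one per Availability Zone in a three-AZ VPC
```

Under these assumptions, moving from one NAT Gateway to three adds roughly $60 per month.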

Email cloud-support@cornell.edu if you'd like help setting up additional NAT Gateways in your Cornell AWS account.

Working with Data

When should I use Direct Connect and when should I use the public internet to transfer data?

Direct Connect is mostly useful when consistent latency must be maintained between systems on campus and in AWS. Another use case is when a policy requires you to use a private network, or when you must access a system on campus that cannot be reached via the public internet, either because of firewall rules that cannot be changed or because the system is only in campus 10-Space.

In the majority of other scenarios, the Cloud Team recommends using the public internet to transfer all data and updating firewall configurations to allow access to/from the internet with trusted systems that you run in AWS. The available bandwidth to the internet is much greater than the 1Gbps Direct Connect that is shared among many units at Cornell.

We also recommend using end-to-end encryption whenever transferring data over the internet. If you are using AWS provided CLI or SDKs (or 3rd party tools that utilize these) to transfer data to AWS, your connections will be encrypted by default.

How do I transfer a large file (>1GB) to Amazon S3?

Amazon S3 supports individual objects up to 5TB in size. However, when uploading large files, you run the risk of that transfer being interrupted and having to start over. Each individual connection to S3 also only gets 100Mbps from AWS.  

We recommend using the AWS CLI or a 3rd party tool to utilize "multipart uploads" when transferring large files. Most tools also multithread when uploading the parts of your file, so you will be able to utilize the full bandwidth of your machine (usually 1Gbps on campus).
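As a rough sketch of the arithmetic behind this advice, the shell functions below estimate transfer time at a given link speed and the smallest part size that keeps an upload within S3's limit of 10,000 parts per multipart upload. The function names are illustrative and not part of any tool. The commented aws lines assume the AWS CLI is installed and configured; the bucket name, file name, and setting values are placeholders.

```shell
# Estimate seconds to move SIZE_MB megabytes over a link of MBPS megabits/s.
# Uses decimal units (1 MB = 8 megabits) and rounds up.
transfer_seconds() {
  size_mb=$1
  mbps=$2
  echo $(( (size_mb * 8 + mbps - 1) / mbps ))
}

# Smallest part size (in MiB) that fits SIZE_MIB into at most 10,000 parts,
# never below S3's 5 MiB minimum part size.
min_part_size_mib() {
  size_mib=$1
  part=$(( (size_mib + 9999) / 10000 ))
  if [ "$part" -lt 5 ]; then part=5; fi
  echo "$part"
}

# Example: raise the CLI's parallelism and part size, then upload.
# (Not run here; requires credentials. Bucket and file names are placeholders.)
#   aws configure set default.s3.max_concurrent_requests 20
#   aws configure set default.s3.multipart_chunksize 64MB
#   aws s3 cp ./bigfile.tar s3://my-bucket/bigfile.tar
```

For instance, `transfer_seconds 1000 100` prints 80 and `transfer_seconds 1000 1000` prints 8, which is the gap between a single 100Mbps connection and a fully used 1Gbps campus link.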

The following tools support multipart uploads:

  • AWS CLI
    1. Access Keys for AWS CLI Using Cornell Two-Step Login - Shibboleth
  • rclone
    1. rclone config
      1. Set the ID, secret, and session token (under advanced config)
  • Cyberduck
    1. Copy ID, Secret and Token from ~/.aws/credentials {name}
      1. aws_access_key_id = [ paste ID ]
      2. aws_secret_access_key = [ paste key ]
      3. aws_session_token = [ paste token ]
    2. Download the Cyberduck STS token profile
    3. Open Connection - S3 (Credentials from AWS Security Token Service)
      1. Specify the profile from step 1
  • Mountain Duck is now available, with a process similar to the one outlined above for Cyberduck.

STS Token use for manual data transfers with existing Shibboleth IAM roles

There are some options here:

  1. Install the aws login tool (Access Keys for AWS CLI Using Cornell Two-Step Login - Shibboleth)
  2. Docker with the aws login tool with other helpful cloud utilities (https://github.com/CU-CommunityApps/ct-cloud-utils-dockerized)
  3. Install the aws cli (https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-install.html) using 'aws sts get-session-token' with a new or existing IAM user (https://docs.aws.amazon.com/cli/latest/reference/sts/get-session-token.html)
    1. Create a new profile or use the default profile
    2. "aws configure --profile {name}"
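The credential steps above can be partly scripted. The sketch below assumes the AWS CLI is installed and that a profile with long-lived IAM user keys already exists. The function name format_sts_profile and the profile names "base" and "sts-temp" are hypothetical. It turns the tab-separated output of 'aws sts get-session-token' into the three ~/.aws/credentials lines listed above.

```shell
# Read "ID<TAB>SECRET<TAB>TOKEN" on stdin and print a credentials-file
# section named after the first argument. The default field splitting of
# `read` is enough because none of the three values contains whitespace.
format_sts_profile() {
  profile=$1
  read -r key_id secret token
  printf '[%s]\naws_access_key_id = %s\naws_secret_access_key = %s\naws_session_token = %s\n' \
    "$profile" "$key_id" "$secret" "$token"
}

# Example (not run here; needs valid IAM user credentials in profile "base"):
#   aws sts get-session-token --profile base --duration-seconds 3600 \
#     --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' \
#     --output text | format_sts_profile sts-temp >> ~/.aws/credentials
```

The resulting profile name is what you would then give to rclone or to the Cyberduck STS connection profile. Session tokens expire, so the section has to be regenerated once the chosen duration has passed.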

Mechanical Turk (MTurk)

Can I use Mechanical Turk with my Cornell AWS account?

  • Mechanical Turk requester accounts can use the same email address and password as AWS root accounts.
    However, in order to keep these concerns separate, we recommend using different email accounts for each of AWS, Amazon.com retail store, and Mechanical Turk. 

  • As of December 2020, MTurk accounts can be linked to AWS accounts for billing purposes.
    MTurk accounts linked like that have their charges included in the charges for the AWS account. Please contact cloud-support@cornell.edu to link your MTurk account.
    • With this linkage, research awards/credits issued by AWS to an AWS account can be used for paying MTurk charges.
    • Only one MTurk account can be linked to each AWS account.
    • In order to establish this linkage, the root credentials for the AWS account must be used.
      If the Cloud Team manages the root credentials for an AWS account, we will be happy to help establish this linkage. Please contact cloud-support@cornell.edu.

  • As of August 2023, for new MTurk Requester accounts that are created and linked to AWS accounts, the Requester UI is available ONLY through the root login for the account, not through another email account.
    • On the MTurk Account page, one can specify a Requester name and an alternate email for contact by workers, as well as a Display Name.
    • One can still manage the Requester through SDK or CLI scripts, using User Access Keys in the AWS account that have Mechanical Turk permissions through attached policies.
    • Alternatively, one can use a third-party solution like CloudResearch, which can provide a UI and use access keys to manage MTurk Requester functions: creating HITs, setting up the Sandbox, interacting with workers, vetting workers, etc.
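As a minimal sketch of the CLI route mentioned above, the helper below selects the MTurk requester API endpoint so that scripts can be pointed at the Sandbox before they touch production. The function name mturk_endpoint and the profile name "mturk" are hypothetical; the commented command assumes the AWS CLI is installed and that the profile holds access keys with Mechanical Turk permissions.

```shell
# Print the MTurk requester API endpoint for "sandbox" or "production".
mturk_endpoint() {
  case "$1" in
    sandbox)    echo "https://mturk-requester-sandbox.us-east-1.amazonaws.com" ;;
    production) echo "https://mturk-requester.us-east-1.amazonaws.com" ;;
    *)          echo "usage: mturk_endpoint sandbox|production" >&2; return 1 ;;
  esac
}

# Example (not run here; needs access keys with MTurk permissions in a
# profile, here hypothetically named "mturk"):
#   aws mturk get-account-balance --profile mturk --region us-east-1 \
#     --endpoint-url "$(mturk_endpoint sandbox)"
```

Defaulting scripts to the Sandbox endpoint is a cautious choice, since calls against the production endpoint can incur real charges.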

Can I use tagging in Mechanical Turk?

...