A quick glance at the list of services available on AWS can be overwhelming. As of September 2020, there are over 160 AWS services, some of which have informative names, such as Simple Storage Service (S3), and others whose names are far more obscure, like Neptune and Greengrass. The number and variety of services have even inspired catchy songs. Looking at the list, you might start to think, “Do I need to use all of those to build my application?” Fortunately, the answer is no. Most applications use just a handful of services, and all of the services can ultimately be broken down into two categories: storage and computing.
Building on cloud services involves a series of trade-offs. Debating these trade-offs can be simplified by first distinguishing which services are storage and which are computing, and then looking for features within each group that make one service better suited to your use case than the others. Oftentimes, multiple services could address your need, and then you might consider factors such as cost, maintenance, and accessibility.
At its core, AWS is really just like your personal computer. When you finish writing a document, you save it. This action of saving tells the computer to allocate a portion of the hard drive to store the document so you can retrieve it later. When you edit the document at a later point, your computer changes the stored information, in essence computing new values, until you save it again. Computing is any form of data transformation done to accomplish a task.
But wait, aren’t storage and compute categories in the list of services? How can all the services fit into those two categories?
Yes, AWS does list storage and compute as specific categories alongside many more, but this categorization is misleading. These two categories contain only the services that are commonly understood to be storage and compute. When you start to read about the other services, you will see that most of them are a form of storage or compute tailored to a specific purpose.
Let’s break down some of the other services to see how analyzing each service as either storage or computing can make it easier to identify the appropriate services for your work.
Database
All the services under Database are storage formats fit for specific use cases.
The Relational Database Service (RDS) is a standard relational database fit for well-structured data where your primary goal is to run analytics over the data rather than retrieve individual rows. These queries are common in business intelligence, where aggregating and summarizing values is commonplace.
DynamoDB is a non-relational database that is well-suited for storing information where retrieving individual rows is critical. Maybe you want to build an e-commerce website where customers browse a catalog, read about each product’s features, and move specific products to a virtual shopping cart. You might store both your catalog and the basket in DynamoDB tables because this type of database is optimized for storing and retrieving information about individual items.
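The difference between these two access patterns can be sketched locally, without touching AWS. In the sketch below, Python's built-in sqlite3 module stands in for a relational database and a plain dictionary stands in for a key-value table. The table name, columns, and sample rows are invented for illustration; none of this is an AWS API.

```python
import sqlite3

# Hypothetical order data, invented for illustration.
orders = [
    ("o-1", "keyboard", 2, 45.0),
    ("o-2", "monitor", 1, 180.0),
    ("o-3", "keyboard", 1, 45.0),
]

# Relational pattern: summarize many rows at once with SQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders "
    "(order_id TEXT, product TEXT, quantity INTEGER, unit_price REAL)"
)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", orders)
revenue_by_product = dict(
    conn.execute(
        "SELECT product, SUM(quantity * unit_price) "
        "FROM orders GROUP BY product"
    ).fetchall()
)
conn.close()

# Key-value pattern: fetch one item directly by its key.
catalog = {
    "keyboard": {"price": 45.0, "description": "Mechanical keyboard"},
    "monitor": {"price": 180.0, "description": "27-inch display"},
}
item = catalog["monitor"]

print(revenue_by_product)
print(item)
```

The first half answers a question about all the rows together, while the second half retrieves a single record by key, which is the kind of lookup a product page or shopping cart performs.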
Other database services are more specialized because their data types present challenges for general-purpose relational and non-relational databases. Neptune is a graph database useful for network analysis, and Timestream stores time series data.
Machine Learning
Machine Learning is an incredibly common use case in cloud computing because these tasks often require more computational resources than your personal computer can realistically provide. At their core, all of the Machine Learning services simplify the process of transforming data. With this in mind, you can scan the long list of AWS services for the computing services that you need.
You could deploy your own machine learning model using generic computing services, such as EC2, but you would need to configure the environment to ensure the server had all of the right packages to run the model. Additionally, you would need to manage how the servers scaled to meet changing demand.
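To make the scaling burden concrete, here is a minimal sketch of the arithmetic you would have to keep doing yourself. The function name, the per-server capacity, and the traffic numbers are all hypothetical, and this is not how any AWS service is implemented; it only illustrates the bookkeeping involved in matching servers to demand.

```python
import math


def servers_needed(requests_per_second: float,
                   capacity_per_server: float,
                   minimum: int = 1) -> int:
    """Return how many servers are needed to absorb the current load."""
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    return max(minimum, math.ceil(requests_per_second / capacity_per_server))


# Hypothetical load (requests per second) sampled at four points in a day.
traffic = [5, 40, 120, 30]

# Assume, for illustration, that one server handles 25 requests per second.
fleet_sizes = [servers_needed(load, capacity_per_server=25) for load in traffic]
print(fleet_sizes)
```

Every change in traffic means recomputing the fleet size and then actually starting or stopping servers, on top of keeping each server's environment configured.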
AWS provides the SageMaker service to take the headache out of deploying machine learning models. You simply pick a predefined machine learning model or supply your own, and AWS will ensure all the appropriate resources are available and the underlying EC2 instances scale up and down automatically to meet demand. Many of the other services under this category focus on computing for specific machine learning tasks, such as forecasting with Amazon Forecast or natural language processing with Amazon Comprehend.
Suites of Services For Common Tasks
Some of the other categories are a combination of storage and compute because specific use cases require highly specialized storage and computing options. Let’s break down one of these categories to see how AWS has developed specialized storage and computing services for common developer tasks.
Developer Tools
Code has to be stored somewhere. You might be familiar with GitHub or Bitbucket, two of the most popular websites for hosting code. These websites allow you to save your code in a repository and collaborate with others. AWS has its own service for hosting code repositories, CodeCommit. This service is a storage service because it simply saves versions of your code.
Oftentimes, a developer might want to trigger an action whenever a new version of a file is added to a repository. The trigger might update the content on a website or fix a bug in a mobile app. AWS has services that make it easy to set up these triggers and ensure that any changes do not break your pre-existing application. CodeBuild, CodeDeploy, and CodePipeline are a suite of services that manage the process of testing and deploying any changes. At their core, these services identify changes in the CodeCommit repository and transform your application to meet the changes.
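The idea of a trigger that runs a series of stages can be sketched in a few lines. The sketch below is a local illustration only: the stage functions, the commit labels, and the rule that a commit containing "broken" fails its tests are all invented, and nothing here calls CodeBuild, CodeDeploy, or CodePipeline.

```python
from typing import Callable, List


def build(commit: str) -> bool:
    print(f"building {commit}")
    return True


def run_tests(commit: str) -> bool:
    print(f"testing {commit}")
    # Hypothetical rule: commits labelled "broken" fail the test suite.
    return "broken" not in commit


def deploy(commit: str) -> bool:
    print(f"deploying {commit}")
    return True


STAGES: List[Callable[[str], bool]] = [build, run_tests, deploy]


def on_new_commit(commit: str,
                  stages: List[Callable[[str], bool]]) -> List[str]:
    """Run each stage in order and stop at the first one that fails."""
    completed = []
    for stage in stages:
        if not stage(commit):
            break
        completed.append(stage.__name__)
    return completed


print(on_new_commit("abc123", STAGES))
print(on_new_commit("broken-xyz", STAGES))
```

Because the stages run in order and stop at the first failure, a change that fails its tests never reaches the deploy step, which is how a pipeline protects the application that is already running.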
You could use open-source tools to complete many of the tasks these AWS services accomplish, but it is often simpler and cheaper to use the AWS services because they have been specifically developed to work with other AWS services. But beware: heavy reliance on these services could make it more difficult to move to a different cloud service provider.
Services for Common Types of Data
Software applications handle many types of data, and managing some of them is often outside your primary area of expertise. You might need to manage database credentials to ensure no one accesses your application’s content without authentication, or you might need to store data temporarily until your application is ready to process it.
For these use cases, you could use a generic storage service, like S3, but you would also need to control data security or even manage servers to store the data effectively. Luckily, there are specialty services that address these issues, freeing developers to focus on building applications. Many of these services do not seem like storage, but when you remove the special features, they all manage data storage in ways best suited for the specific data type.
Let’s look at two examples of common data types and how AWS services address their unique issues.
Temporary Data: Many applications need to communicate information between services. For example, imagine you are developing an application to apply computer vision to images captured from a video camera. The camera will stream data continuously, maybe dozens or hundreds of frames every second. Your computing service might take several seconds to process each image, creating a bottleneck in your pipeline. What do you do with incoming data until your computing service is ready to process it? You could store each item in S3 and tell your computing service to look for items there and then delete them, but then you would accrue S3 costs for items that only exist for a short amount of time. If the order of items matters, you would also need to manually configure your computing service to ensure the items are processed in the right order.
AWS offers services like Simple Queue Service (SQS) and Kinesis specifically for storing and managing temporary data. These services can ingest thousands or millions of records per second, can preserve the order in which items are received, and can also deliver the items to your computing services automatically. These features start to blur the line between storage and computing, but at their core, these services make storing temporary data easier.
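The buffering role of a queue can be sketched with Python's standard library. In this sketch an in-memory queue.Queue stands in for a managed queue, one thread plays the camera, and another plays the computing service. The frame names and the STOP sentinel are invented for illustration, and the code does not use SQS or Kinesis.

```python
import queue
import threading

frames = queue.Queue()  # in-memory first-in, first-out buffer
results = []
STOP = object()         # hypothetical sentinel that tells the worker to finish


def worker() -> None:
    """Take frames off the queue one at a time, oldest first."""
    while True:
        frame = frames.get()
        if frame is STOP:
            break
        results.append(f"{frame}: processed")


consumer = threading.Thread(target=worker)
consumer.start()

# The producer adds frames without waiting for the worker to catch up.
for n in range(5):
    frames.put(f"frame-{n}")
frames.put(STOP)

consumer.join()
print(results)
```

The producer never waits for the consumer, and the consumer still sees the frames in the order they arrived. A managed queue provides the same decoupling across separate machines, which an in-memory queue cannot do.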
Sensitive Data: Databases often require a username and password to authenticate users before they can read and write information, and passwords need special procedures, such as encryption, to maintain security. AWS Secrets Manager stores sensitive information, like database credentials, and encrypts it with keys managed by the Key Management Service (KMS), which provides security features such as encryption key rotation to ensure your credentials are protected.
Takeaways
Don’t let the long list of services scare you; AWS services attempt to address the common needs for developers in many different disciplines.
All AWS services can be classified as either storage or computing. Some are generic and easy to identify, while others are highly specialized and provide additional features that blur the line between storage and computing. To identify the right service for your use case, just ask yourself a couple of questions: Am I looking to store or transform data? Does AWS provide specialty services for storing and computing the type of data I am handling?
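As a closing illustration, the classification used throughout this article can be written down as a small lookup table. The dictionary and helper function below are hypothetical, and the labels simply restate how the article describes each service, including the ones it admits blur the line.

```python
# Storage/compute labels as this article assigns them (illustrative only).
SERVICE_ROLES = {
    "S3": "storage",
    "RDS": "storage",
    "DynamoDB": "storage",
    "Neptune": "storage",
    "Timestream": "storage",
    "CodeCommit": "storage",
    "SQS": "storage",
    "Kinesis": "storage",
    "EC2": "compute",
    "SageMaker": "compute",
    "CodeBuild": "compute",
    "CodeDeploy": "compute",
    "CodePipeline": "compute",
}


def services_for(role: str) -> list:
    """Return the services labelled with the given role, sorted by name."""
    return sorted(name for name, r in SERVICE_ROLES.items() if r == role)


print(services_for("storage"))
print(services_for("compute"))
```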