Category: Cloud Architecting

A File Extraction Project

I had a client approach me regarding a set of files they had. The files were a set of certificates to support their products. They deliver these files to customers in the sales process.

The workflow currently involves manually packaging the files up into a deliverable format. The client asked me to automate this process across their thousands of documents.

As I started thinking through how this would work, I decided on a serverless approach: Amazon S3 for document storage, Lambda for the processing, and S3 with CloudFront to host a front end for the application.

My current architecture involves two S3 buckets. One bucket stores the original PDF documents, and the other receives copies of the documents that we are going to package up for the client before sending.

The idea is that we can tag each PDF file with its appropriate lot number supplied by the client. I will then use a simple form submission process to supply input into the function that will collect the required documents.
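As a rough sketch of the tagging step, assuming each PDF carries a single tag whose key is `lot` (the key name, bucket name, and object key here are placeholders, not taken from the client's setup), a lot number could be attached with the S3 `put_object_tagging` call. The function takes the client as an argument, so in practice you would pass `boto3.client('s3')`; a small stand-in client is used here so the sketch runs without AWS access.

```python
def tag_with_lot(s3, bucket, key, lot_number):
    """Attach a lot-number tag to one object.

    `s3` is expected to be a boto3 S3 client. The tag key "lot" is an
    assumption for illustration. Note that put_object_tagging replaces
    any tags already on the object.
    """
    s3.put_object_tagging(
        Bucket=bucket,
        Key=key,
        Tagging={"TagSet": [{"Key": "lot", "Value": str(lot_number)}]},
    )


class RecordingClient:
    """Stand-in for the S3 client so the sketch runs without AWS access."""

    def __init__(self):
        self.calls = []

    def put_object_tagging(self, **kwargs):
        self.calls.append(kwargs)


client = RecordingClient()
tag_with_lot(client, "source-bucket", "certs/widget.pdf", 4521)
print(client.calls[0]["Tagging"])
```

Keeping one tag per object matters for the lookup later on, since the Lambda function only reads the first tag in each object's tag set.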

Here is the code for the web frontend:

<!DOCTYPE html>
<html>
<head>
    <script src="https://ajax.googleapis.com/ajax/libs/jquery/2.2.4/jquery.min.js"></script>
    <script type="text/javascript">
        $(document).ready(function() {

            $("#submit").click(function(e) {
                e.preventDefault();

                var lot = $("#lot").val();

                $.ajax({
                    type: "POST",
                    url: 'API_URLHERE',
                    contentType: 'application/json',
                    data: JSON.stringify({
                        'body': lot,
                    }),
                    success: function(res){
                        $('#form-response').text('Query Was processed.');
                    },
                    error: function(){
                        $('#form-response').text('Error.');
                    }
                });

            })

        });
    </script>
</head>
<body>
<form>
    <label for="lot">Lot</label>
    <input id="lot">
    <button id="submit">Submit</button>
</form>
<div id="form-response"></div>
</body>
</html>

This is a single field input form that sends a string to my Lambda function. Once the string is received we will convert it into a JSON object and then use that to find our objects within Amazon S3.

Here is the function:

import boto3
import json


def lambda_handler(event, context):
    form_response = event['body']
    tag_list = json.loads(form_response)
    print(tag_list)
    tag_we_want = tag_list['body']
    
    
    
    s3 = boto3.client('s3')
    bucket = "source_bucket"
    destBucket = "destination_bucket"
    download_list = []
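    # Get all the object keys in the bucket. A single list call returns at
    # most 1,000 keys, so paginate to cover thousands of documents.
    paginator = s3.get_paginator('list_objects_v2')
    object_keys = []
    for page in paginator.paginate(Bucket=bucket):
        for obj in page.get('Contents', []):
            object_keys.append(obj['Key'])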

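    # Record the first tag value of each object, skipping untagged objects
    object_tags = []
    for key in object_keys:
        tagging = s3.get_object_tagging(
            Bucket= bucket,
            Key=key,
        )
        if not tagging['TagSet']:
            continue

        object_tags.append(
            {
            'Key': key,
            'tags': tagging['TagSet'][0]['Value']
            }
        )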

    for tag in object_tags:

        if tag['tags'] == tag_we_want:
            object_name = tag['Key']
            s3.copy_object(
                Bucket= destBucket,
                CopySource= {
                    'Bucket': bucket,
                    'Key': object_name,
                },
                Key= object_name,
            )
            download_list.append(object_name)

    return download_list, tag_we_want

In this code, we define our source and destination buckets first. With the string from the form submission, we first gather all the objects within the bucket and then iterate over each object to find matching tags.

Once we gather the files we want for our customers we then transfer these files to a new bucket. I return the list of files out of the function as well as the tag name.
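Because the handler reads `event['body']` as a JSON string, the function appears to sit behind an API Gateway proxy integration. If that assumption holds, the integration expects a dictionary with a `statusCode` and a string `body` rather than a tuple. A minimal sketch of parsing the request and shaping the reply follows; the helper names `extract_lot` and `build_response` are hypothetical.

```python
import json


def extract_lot(event):
    """Pull the lot number out of the form submission.

    The frontend posts {"body": "<lot>"}, which arrives as a JSON string
    in event["body"].
    """
    payload = json.loads(event["body"])
    return payload["body"]


def build_response(download_list, lot):
    """Shape the result the way a Lambda proxy integration expects."""
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"lot": lot, "files": download_list}),
    }


sample_event = {"body": json.dumps({"body": "LOT-1001"})}
lot = extract_lot(sample_event)
response = build_response(["cert-a.pdf", "cert-b.pdf"], lot)
print(response["body"])
```

Returning the file list in the body would also let the form show which documents matched instead of a generic success message.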

My next step is to package all the required files into a ZIP file for downloading. I first attempted to do this in Lambda but quickly ran into the fact that the function's file system is read-only; only the size-limited /tmp directory is writable.

Right now, I am thinking of utilizing Docker to spawn a worker which will generate the ZIP file, place it back into the bucket and provide a time-sensitive download link to the client.
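One alternative worth sketching before reaching for Docker: a ZIP archive can be assembled in memory with Python's `zipfile` and `io.BytesIO`, so nothing has to touch the disk. The sketch below assumes the files for one lot fit comfortably in memory. `package_lot`, the key names, and the one-hour expiry are illustrative choices, and `s3` is expected to be a boto3 S3 client.

```python
import io
import zipfile


def build_zip(files):
    """Return ZIP bytes for a mapping of file name -> file contents (bytes)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def package_lot(s3, bucket, keys, zip_key, expires_in=3600):
    """Zip the given objects, upload the archive, and return a presigned URL."""
    files = {}
    for key in keys:
        files[key] = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    s3.put_object(Bucket=bucket, Key=zip_key, Body=build_zip(files))
    # The link stops working after `expires_in` seconds
    return s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": zip_key},
        ExpiresIn=expires_in,
    )


# Local check of the archive step only; no AWS access is needed
archive_bytes = build_zip({"cert-a.pdf": b"first", "cert-b.pdf": b"second"})
with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
    names = archive.namelist()
print(names)
```

The presigned URL covers the "time-sensitive download link" requirement regardless of where the archive is built, so the same call would apply to a Docker worker.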

Stay tuned for more updates on this project.

December 7, 2020

A Self Hosted Server Health Check

I’m not big on creating dashboards. I find that I don’t look at them enough to warrant hosting the software on an instance and having to have the browser open to the page all the time.

Instead, I prefer to be alerted via Slack as much as possible. I had already written scripts to collect DNS records from Route53, so I decided to expand on the idea and create a scheduled job that executes at a fixed interval. This way my health checks are fully automated.

Before we get into the script, you might ask me why I don’t just use Route53 health checks! The answer is fairly simple. First, the cost of health checks for HTTPS doesn’t make sense for the number of web servers that I am testing. Second, I don’t want to test Route53 or any AWS resource from within AWS. Rather, I would like to use my own network to test as it is not connected to AWS.

You can find the code and the Lambda function hosted on GitHub. The overall program utilizes a few different AWS products:

Lambda
SNS
CloudWatch Logs

It also uses Slack but that is an optional piece that I will explain. The main functions reside in “main.py”. This piece of code follows the process of:

Iterating over Route53 Records
Filtering out “A” records and compiling a list of domains
Testing each domain and processing the response code
Logging all of the results to CloudWatch Logs
Sending errors to the SNS topic
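The filtering and alerting decisions in the steps above can be sketched as two small pure functions, which makes them easy to check without touching AWS. The function names are hypothetical, and the set of "quiet" status codes mirrors the ones the script logs without alerting (200, 301, 302, 401, 403).

```python
def urls_from_record_sets(record_sets):
    """Build https URLs from the 'A' records in a list of Route53 record sets."""
    urls = []
    for record in record_sets:
        if record["Type"] == "A":
            # Route53 names end with a trailing dot, e.g. "example.com."
            urls.append(f"https://{record['Name'].rstrip('.')}")
    return urls


def should_alert(status_code, quiet_codes=(200, 301, 302, 401, 403)):
    """Return True when a status code should be sent to the SNS topic."""
    return status_code not in quiet_codes


sample = [
    {"Name": "example.com.", "Type": "A"},
    {"Name": "example.com.", "Type": "MX"},
    {"Name": "www.example.com.", "Type": "A"},
]
print(urls_from_record_sets(sample))
print(should_alert(500), should_alert(403))
```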

I have the script running on a CRON job every hour.

The second piece of this is the Lambda function. The function is all packaged in “lambda_function.zip”, but I also added the function outside of the ZIP file for editing. You can modify this function to utilize your Slack credentials.

The Lambda function is subscribed to your SNS topic so that whenever a new message appears, that message is sent to your specified Slack channel.

I have plans to test my Terraform skills to automate the deployment of the Lambda function, SNS topic, CloudWatch Logs, and the primary script in some form.

If you have any comments on how I could improve this function please post a comment here or raise an issue on GitHub. If you find this script helpful in any way feel free to share it with your friends!

Links:
Server Health Check – GitHub

Code – Main Function (main.py)

import boto3
import requests
import os
import time


#aws variables
sns = boto3.client('sns')
aws = boto3.client('route53')
cw = boto3.client('logs')
paginator = aws.get_paginator('list_resource_record_sets')
response = aws.list_hosted_zones()
hosted_zones = response['HostedZones']
time_now = int(round(time.time() * 1000))

#create empty lists
zone_id_to_test = []
dns_entries = []
zones_with_a_record = []
#Create list of ZoneID's to get record sets from       
for key in hosted_zones:
    zoneid = key['Id']
    final_zone_id = zoneid[12:]
    zone_id_to_test.append(final_zone_id)

#Collect the record sets for every hosted zone
def getARecord(zoneid):
    for zone in zoneid:
        try:
            response = paginator.paginate(HostedZoneId=zone)
            for record_set in response:
                dns = record_set['ResourceRecordSets']
                dns_entries.append(dns)

        except Exception as error:
            print('An Error')
            print(str(error))
            raise
#Build the list of URLs to test from the "A" records (despite the function name)
def getCNAME(entry):
    for dns_entry in entry:
        for record in dns_entry:
            if record['Type'] == 'A':
                url = (record['Name'])
                final_url = url[:-1]
                zones_with_a_record.append(f"https://{final_url}")
#Send Result to SNS                
def sendToSNS(messages):
    message = messages
    try:
        send_message = sns.publish(
            TargetArn='YOUR_SNS_TOPIC_ARN_HERE',
            Message=message,
            )
    except:
        print("something didn't work")
def tester(urls):
    # These codes mean the server answered, so they are logged without alerting
    quiet_codes = {200, 301, 302, 401, 403}
    for url in urls:
        try:
            user_agent = {'User-agent': 'Mozilla/5.0'}
            # The timeout keeps one unresponsive host from hanging the whole run
            status = requests.get(url, headers = user_agent, allow_redirects=True, timeout=10)
            code = status.status_code
            if code not in quiet_codes:
                sendToSNS(f"The site {url} reports: {code}")
            writeLog(f"The site {url} reports status code: {code}")
        except requests.RequestException as error:
            # No response came back, so there is no status code to report
            sendToSNS(f"The site {url} failed testing")
            writeLog(f"The site {url} failed testing: {error}")

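def writeLog(message):
    # Look up the same group and stream that the event is written to
    getToken = cw.describe_log_streams(
        logGroupName='YOUR_LOG_GROUP_NAME',
        logStreamNamePrefix='YOUR_LOG_STREAM_NAME',
        )
    logInfo = (getToken['logStreams'])
    event_args = {
        'logGroupName': 'YOUR_LOG_GROUP_NAME',
        'logStreamName': 'YOUR_LOG_STREAM_NAME',
        'logEvents': [
            {
                'timestamp': time_now,
                'message': message
            },
        ],
    }
    # A brand-new stream has no sequence token yet, so only pass it when present
    nextToken = logInfo[0].get('uploadSequenceToken')
    if nextToken:
        event_args['sequenceToken'] = nextToken
    response = cw.put_log_events(**event_args)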
#Execute            
getARecord(zone_id_to_test)
getCNAME(dns_entries)
tester(zones_with_a_record)

Code: Lambda Function (lambda_function.py)

import logging
logging.basicConfig(level=logging.DEBUG)

import os
from slack import WebClient
from slack.errors import SlackApiError


slack_token = os.environ["slackBot"]
client = WebClient(token=slack_token)

def lambda_handler(event, context):
    detail = event['Records'][0]['Sns']['Message']
    response_string = f"{detail}"
    try:
        response = client.chat_postMessage(
            channel="YOUR CHANNEL HERE",
            text="SERVER DOWN",
            blocks = [{"type": "section", "text": {"type": "plain_text", "text": response_string}}]
        )   

    except SlackApiError as e:
        assert e.response["error"]
    return

December 3, 2020

Where Is It 5 O’Clock Pt: 4
As much as I’ve scratched my head working on this project, it has been fun to learn some new things and build something that isn’t infrastructure automation. I’ve learned some frontend web development and some backend development, and utilized some new Amazon Web Services products.

With all that nice stuff said I’m proud to announce that I have built a fully functioning project that is finally working the way I intended it. You can visit the website here:

www.whereisitfiveoclock.net

To recap, I bought this domain one night as a joke and thought “Hey, maybe one day I’ll build something”. I started off building a fully Python application backed by Flask. You can read about that in Part 1. This did not work out the way I intended as it did not refresh the timezones on page load. In Part 3 I discussed how I was rearchitecting the project to include an API that would be called upon page load.

The API worked great and delivered two JSON objects into my frontend. I then parsed the two JSON objects into two separate tables that display where you can be drinking and where you probably shouldn’t be drinking.

This is a snippet of the JavaScript I wrote to iterate over the JSON objects while adding them into the appropriate table:
```
function buildTable(someinfo){
    // The two HTML tables the results are rendered into
    var table1 = document.getElementById('its5pmsomewhere')
    var table2 = document.getElementById('itsnot5here')
    // someinfo holds the two JSON strings delivered by the API
    var its5_json = JSON.parse(someinfo[0]);
    var not5_json = JSON.parse(someinfo[1]);
    // Timezones where it is 5 o'clock fill the first column of table1
    its5_json['its5'].forEach((value) => {
        var row = `<tr>
                    <td>${value}</td>
                    <td></td>
                    </tr>`

        table1.innerHTML += row
    })
    // All other timezones fill the second column of table2
    not5_json['not5'].forEach((value) => {
        var row = `<tr>
                    <td></td>
                    <td>${value}</td>
                    </tr>`

        table2.innerHTML += row
    })
}
```
First I reference two different HTML tables. I then parse the JSON from the API. Finally, I iterate over both JSON objects, building a row for each timezone and appending it to the matching HTML table.

If you want more information on how I did this feel free to reach out.

I want to continue iterating over this application to add new features. I need to do some standard things like adding Google Analytics so I can track traffic. I also want to add a search feature and a map that displays the different areas of drinking acceptability.

I also am open to requests. One of my friends suggested that I add a countdown timer to each location that it is not yet acceptable to be drinking.

Feel free to reach out in the comments or on your favorite social media platform! And as always, if you liked this project please share it with your friends.
September 23, 2020
Where Is It Five O’Clock Pt: 3

So I left this project at a point where I felt it needed to be re-architected based on the fact that Flask only executes the function once and not every time the page loads.

I re-architected the application in my head to include an API that calls the Lambda function and returns a list of places where it is and is not acceptable to be drinking based on the 5 O’Clock rules. These two lists will be JSON objects that have a single key with multiple values. The values will be the timezones appropriate to be drinking in.

After the JSON objects are generated I can reference them through the web frontend and display them in an appropriate way.
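A minimal sketch of how the two JSON objects could be produced, assuming the key names `its5` and `not5` that the frontend reads. The function name `split_by_five` is hypothetical, and fixed UTC offsets stand in for real timezones so the example runs without a timezone database; in the real function the `zones` mapping would be built from pytz.

```python
import json
from datetime import datetime, timedelta, timezone


def split_by_five(now_utc, zones):
    """Split zone names into 'its5' and 'not5' JSON strings.

    `zones` maps a display name to a tzinfo object. A zone counts as
    "5 o'clock" when its local hour is 17 or later.
    """
    its5, not5 = [], []
    for name, tz in zones.items():
        hour = now_utc.astimezone(tz).hour
        if hour >= 17:
            its5.append(name)
        else:
            not5.append(name)
    return json.dumps({"its5": its5}), json.dumps({"not5": not5})


# Fixed offsets used purely for illustration
zones = {
    "UTC+5": timezone(timedelta(hours=5)),
    "UTC-5": timezone(timedelta(hours=-5)),
}
noon_utc = datetime(2020, 9, 22, 12, 0, tzinfo=timezone.utc)
its5_json, not5_json = split_by_five(noon_utc, zones)
print(its5_json, not5_json)
```

Because the current time is passed in rather than read inside the function, each API call computes a fresh answer, which is the behavior the Flask version was missing.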

At this point I have the API built out and fully functioning the way I think I want it. You can use it by executing the following:
curl https://5xztnem7v4.execute-api.us-west-2.amazonaws.com/whereisit5

I will probably only have this publicly accessible for a few days before locking it back down.

Hopefully, in part 4 of this series, I will have a frontend demo to show!

September 22, 2020
Where Is It 5 O’Clock Pt: 2

So I spent the evening deploying this web application to Amazon Web Services. In my test environment, everything appeared to be working great because every time I reloaded the page it reloaded the function as well.

When I transferred this over to a live environment I realized the Python function only ran every time I committed a change and it was re-deployed to my Elastic Beanstalk environment.

This poses a new problem. If the function doesn’t fire every time the page is refreshed the time won’t properly update and it will show incorrect areas of where it is 5 O’Clock. Ugh.

So, over the next few weeks, in my spare time, I will be re-writing this entire application to function the way I intended it to.

I think to do this I will write each function as an AWS Lambda function and then write a frontend that calls these functions on page load. Or, the entire thing will be one function and return the information and it will deploy in one API call.

I also really want to display a map that shows the areas where it is 5PM or later, along with some more CSS to make it pretty and responsive so it works on all devices, but I think this will come in a later revision once the project is actually functioning correctly.

The punch list is getting long…

Follow along here: https://whereisitfiveoclock.net

September 18, 2020
Where Is It Five O’Clock Pt: 1
I bought the domain whereisitfiveoclock.net a while back and have been sitting on it for quite some time. I had an idea to make a web application that would tell you where it is five o’clock. Yes, this is a drinking website.

I saw this project as a way to learn more Python skills, as well as some more AWS skills, and boy, has it put me to the test. So I’m going to write this series of posts as a way to document my progress in building this application.

Part One: Building The Application

I know that I want to use Python because it is my language of choice. I then researched what libraries I could use to build the frontend with. I came across Flask as an option and decided to run with that. The next step I had to do was actually find out where it was 5PM.

In my head, I came up with the process: if I could first get a list of all the timezones and identify the current time in each, I could filter out the timezones where it was 5PM. Once I established where it was 5PM, I could pass that information to Flask and figure out a way to display it.

Here is the function for identifying the current time in all timezones and then storing each key pair of {Timezone : Current_Time }
```
from datetime import datetime

import pytz
from pytz import timezone


def getTime():
    now_utc = datetime.now(timezone('UTC'))
    #print('UTC:', now_utc)
    timezones = pytz.all_timezones
    #get all current times and store them into a list
    tz_array = []
    for tz in timezones:
        current_time = now_utc.astimezone(timezone(tz))
        #store each pair as {timezone name: current hour (0-23)}
        values = {tz: current_time.hour}
        tz_array.append(values)
        
    return tz_array
```
Once everything was stored into tz_array I took that info and passed it through the following function to identify where it was 5PM. I have another function that identifies everything that is NOT 5PM.
```
def find5PM(tz_array):
    #tz_array is the list returned by getTime()
    its5pm = []
    for tz in tz_array:
        for tz_name, hour in tz.items():
            #hour >= 17 means 5PM or later
            if hour >= 17:
                its5pm.append(tz_name)
    return its5pm
```
The function builds a new list containing just the timezone names and returns it.

Once I had all these together I passed them through as variables to Flask. This is where I first started to struggle. In my original revisions of the functions, I was only returning one of the values rather than returning ALL of the values. This resulted in hours of struggling to identify the cause of the problem. Eventually, I had to start over and completely re-work the code until I ended up with what you see above.

The code was finally functional and I was ready to deploy it to Amazon Web Services for public access. I will discuss my design and deployment in Part Two.

http://whereisitfiveoclock.net
September 17, 2020

EC2 Action Slack Notification

I took a brief break from my Lambda function creation journey to go on vacation, but now I’m back!

This function will notify a Slack channel of your choosing when an EC2 instance enters the “running”, “stopping”, “stopped”, or “shutting-down” state. I thought this might be useful for instances that reside under a load balancer. It would be useful to see when your load balancer is scaling up or down in real time via Slack notification.

In order to use this function, you will need to create a Slack Application with an OAuth key and set that key as an environment variable in your Lambda function. If you are unsure of how to do this I can walk you through it!

Please review the function below

import logging
import os
from slack import WebClient
from slack.errors import SlackApiError
logging.basicConfig(level=logging.DEBUG)

# Map each EC2 state to the (notification text, message ending) to post
STATE_MESSAGES = {
    'running': ("An Instance has started", "has started"),
    'shutting-down': ("An Instance is Shutting Down", "is shutting down"),
    'stopped': ("An Instance has stopped", "has stopped"),
    'stopping': ("An Instance is stopping", "is stopping"),
}

# Check EC2 Status
def lambda_handler(event, context):
    detail = event['detail']
    instance = detail['instance-id']
    eventname = detail['state']
    # Slack Variables
    slack_token = os.environ["slackBot"]
    client = WebClient(token=slack_token)
    channel_string = "XXXXXXXXXXXXXXXXXXXX"

    # Ignore states that are not reported on (for example "pending")
    if eventname not in STATE_MESSAGES:
        return

    # Post the state change to Slack
    title, ending = STATE_MESSAGES[eventname]
    response_string = f"The instance: {instance} {ending}"
    try:
        response = client.chat_postMessage(
            channel=channel_string,
            text=title,
            blocks=[{"type": "section", "text": {"type": "plain_text", "text": response_string}}]
        )
    except SlackApiError as e:
        assert e.response["error"]

As always the function is available on GitHub as well:
https://github.com/avansledright/ec2ActionPostToSlack

If you find this function helpful please share it with your friends or repost it on your favorite social media platform!

September 11, 2020

Check EC2 Instance Tags on Launch
In my ever-growing quest to automate my AWS infrastructure deployments, I realized that just checking my tags wasn’t good enough. I should force myself to put tags in otherwise my instances won’t launch at all.

I find this particularly useful because I utilize AWS Backup to do automated snapshots nightly of all of my instances. If I don’t put the “Backup” tag onto my instance it will not be included in the rule. This concept of forced tagging could be utilized across many different applications including tagging for development, production, or testing environments.

To do this I created the Lambda function below. Utilizing EventBridge, I have this function run every time an EC2 instance enters the “running” state.
```
import json
import boto3

def lambda_handler(event, context):
    detail = event['detail']
    ids = detail['instance-id']
    eventname = detail['state']
    ec2 = boto3.resource('ec2')

    # EventBridge reports the state in lowercase, e.g. "running"
    while eventname == 'running':
        print(ids)
        # Check to see if backup tag is added to the instance
        tag_to_check = 'Backup'
        instance = ec2.Instance(ids)
        # instance.tags is None when the instance has no tags at all
        if tag_to_check not in [t['Key'] for t in (instance.tags or [])]:
            instance.stop()
            print("Stopping Instance: ", instance)
            # Refresh the cached attributes, then get the state to break the loop
            instance.reload()
            state = instance.state['Name']
            if state == "shutting-down":
                print("instance is shutting-down")
                break
            elif state == "stopped":
                print("Instance is already stopped")
                break
            elif state == "stopping":
                print("instance is stopping")
                break
        break
```
The function then checks the state of the instance to confirm that it is stopping or stopped, and then breaks out of the loop.
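The EventBridge rule that triggers the function is not shown in this post. As a hedged sketch, an event pattern for EC2 state-change notifications filtered to the “running” state could look like the following; the rule name is a placeholder, and `events` is expected to be a boto3 EventBridge client (`boto3.client('events')`).

```python
import json

# Event pattern matching EC2 instances that enter the "running" state
EVENT_PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["running"]},
}


def create_rule(events, rule_name):
    """Create (or update) the rule; `events` is a boto3 EventBridge client."""
    return events.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(EVENT_PATTERN),
        State="ENABLED",
    )


# Shape of the event the handler receives (the instance ID is made up)
sample_event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "running"},
}
print(sample_event["detail"]["state"] in EVENT_PATTERN["detail"]["state"])
```

The rule still needs a target pointing at the Lambda function, plus permission for EventBridge to invoke it, before any events arrive.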

You can clone the repository from GitHub here:
https://github.com/avansledright/aws-force-ec2-launch-tags

If you utilize the script please share it with your friends. Feel free to modify it as you please and let me know how it works for you! As always, if you have any questions feel free to reach out here or on any other platform!
September 2, 2020
AWS Tag Checker
I wrote this script this morning as I was creating a new web server. I realized that I had been forgetting to add my “Backup” tag to my instances so that they would automatically be backed up via AWS Backup.

This one is pretty straightforward. Utilizing Boto3, this script will iterate over all of your instances and check them for the tag specified on line 8. If the tag is not present it will then add the tag that is defined by JSON in $response.

After that is all done it will iterate over the instances again to check that the tag has been added. If a new instance has been added or it failed to add the tag it will print out a list of instance IDs that do not have the tag.

Here is the script:
```
import boto3


ec2 = boto3.resource('ec2')
inst_describe = ec2.instances.all()

for instance in inst_describe:
    tag_to_check = 'Backup'
    # instance.tags is None when an instance has no tags at all
    if tag_to_check not in [t['Key'] for t in (instance.tags or [])]:
        print("This instance is not tagged: ", instance.instance_id)
        response = ec2.create_tags(
            Resources= [instance.instance_id],
            Tags = [
                {
                    'Key': 'Backup',
                    'Value': 'Yes'
                }
            ]
        )
# Double check that there are no other instances without tags
for instance in inst_describe:
    if tag_to_check not in [t['Key'] for t in (instance.tags or [])]:
        print("Failed to assign tag, or new instance: ", instance.instance_id)
```
The script is also available on GitHub here:
https://github.com/avansledright/awsTagCheck

If you find this script helpful feel free to share it with your friends and let me know in the comments!
September 1, 2020
Lambda Function Post to Slack

I wrote this script out of a need to practice my Python skills. The idea is that if a file gets uploaded to an S3 bucket then the function will trigger and a message with that file name will be posted to a Slack channel of your choosing.
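The function itself lives in the linked repository, so the following is only a sketch of the first half of the idea: pulling the uploaded file name out of the S3 event and turning it into a message. The helper names and the message wording are hypothetical. The event shape follows the standard S3 notification format, in which object keys arrive URL-encoded.

```python
from urllib.parse import unquote_plus


def uploaded_keys(event):
    """Return the object keys named in an S3 event notification."""
    keys = []
    for record in event.get("Records", []):
        # S3 URL-encodes keys in notifications (spaces arrive as "+")
        keys.append(unquote_plus(record["s3"]["object"]["key"]))
    return keys


def build_message(key):
    """Text to post to Slack for one uploaded file."""
    return f"New file uploaded: {key}"


sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "example-bucket"},
                "object": {"key": "reports/monthly+summary.pdf"}}}
    ]
}
messages = [build_message(k) for k in uploaded_keys(sample_event)]
print(messages)
```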

To utilize this you will need to include the Slack pip package as well as the slackclient pip package when you upload the function to the AWS Console.

You will also need to create an OAuth key for a Slack application. If you are unfamiliar with this process feel free to drop a comment below or shoot me a message, and I can walk you through the process or write a second part of the guide.

Here is a link to the project:
https://github.com/avansledright/posttoSlackLambda

If this helps you please share this post on your favorite social media platform!

August 31, 2020