Tag: python
-
Automating Security Group Rule Removal
I’m using an Amazon Web Services Security Group to allow traffic into an EC2 instance for the instance’s users. The users can grant themselves access through a web interface that I wrote for them. Maybe I’ll cover that in a different post.
I found recently that the Security Group was nearing its maximum number of rules. So I decided to start purging rules, which would ultimately force the users to re-add their IP addresses to the group.
Going in and manually removing rules is rather time-consuming, so I figured I could write a script to handle it for me. The first step was to update my previous script, which inserts the rules, so that it adds a tag to each rule. The function below takes a list of Security Group IDs as input and returns all of their rules.
import boto3

def get_sg_rules(sg_id):
    """Return all rules for the given list of Security Group IDs."""
    client = boto3.client('ec2')
    response = client.describe_security_group_rules(
        Filters=[
            {
                'Name': 'group-id',
                'Values': sg_id
            }
        ],
    )
    return response
The script below iterates through each of the rules returned and appends a “dateAdded” tag with a stringified date.
import boto3
from botocore.exceptions import ClientError

client = boto3.client('ec2')

for sg_rule in get_sg_rules(sg_list)['SecurityGroupRules']:
    try:
        response = client.create_tags(
            DryRun=False,
            Resources=[
                sg_rule['SecurityGroupRuleId'],
            ],
            Tags=[
                {
                    'Key': 'dateAdded',
                    'Value': '2022-11-05'
                },
            ]
        )
    except ClientError as e:
        print(e)
I then wrote the following Lambda function, which runs every day and checks for any expired rules. The schedule is set up with a CloudWatch Events rule.
import boto3
from datetime import datetime, timedelta
from botocore.exceptions import ClientError

def return_today():
    now = datetime.now()
    return now

def get_sg_rules(sg_id, old_date):
    client = boto3.client('ec2')
    response = client.describe_security_group_rules(
        Filters=[
            {
                'Name': 'group-id',
                'Values': sg_id
            },
            {
                'Name': 'tag:dateAdded',
                'Values': [old_date]
            }
        ],
    )
    return response

def lambda_handler(event, context):
    sg_list = ["xxxx", "xxx"]
    old_date = datetime.strftime(return_today() - timedelta(days=30), "%Y-%m-%d")
    print(old_date)
    for sg_rule in get_sg_rules(sg_list, old_date)['SecurityGroupRules']:
        try:
            client = boto3.client("ec2")
            response = client.revoke_security_group_ingress(
                GroupId=sg_rule['GroupId'],
                SecurityGroupRuleIds=[sg_rule['SecurityGroupRuleId']]
            )
            print(response)
            print("Successfully deleted the rule")
        except ClientError as e:
            print(e)
            print("Failed to delete rule")
You’ll see that the code has a list of Security Groups to check. It compares the current date to the date 30 days earlier. If a rule’s “dateAdded” tag matches that earlier date, the rule is removed.
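The CloudWatch Events rule itself isn’t shown above. As a rough sketch of the scheduling piece, it can be created with boto3 along these lines; the rule name, target ID, and Lambda ARN are placeholders, not values from my account:

import boto3

events = boto3.client('events')

# Fire once per day
events.put_rule(
    Name='purge-expired-sg-rules',
    ScheduleExpression='rate(1 day)',
    State='ENABLED',
)

# Point the rule at the cleanup Lambda (placeholder ARN)
events.put_targets(
    Rule='purge-expired-sg-rules',
    Targets=[
        {
            'Id': 'sg-rule-cleanup',
            'Arn': 'arn:aws:lambda:us-west-2:123456789012:function:sg-rule-cleanup'
        },
    ]
)

You would also need to grant CloudWatch Events permission to invoke the function (for example, via the Lambda add-permission API) before the rule will fire.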
I hope this helps you automate your AWS accounts. Below are links to the code repository so you can edit the code as needed. Please share it with your friends if this helps you!
-
EC2 Reservation Notification
I realized today that I haven’t updated my EC2 reservations recently. Wondering why, I came to understand that I was never being notified that the reservations were expiring. So I spent the day putting together a script that looks through my reservations, checks their expiration dates, and notifies me when one is within my threshold of three weeks.
I put this together as a local script, but it can also be adapted to run as a Lambda function, which is how I have it set up. As always, you can view my code below and on GitHub.
import boto3
from datetime import datetime, timezone, timedelta
from botocore.exceptions import ClientError
import os
import json

ec2_client = boto3.client("ec2", region_name="us-west-2")

def get_reserved_instances():
    response = ec2_client.describe_reserved_instances()
    reserved_instances = {}
    for reserved_instance in response['ReservedInstances']:
        reserved_instances.update({
            reserved_instance['ReservedInstancesId']: {
                "ExpireDate": reserved_instance['End'],
                "Type": reserved_instance['InstanceType']
            }
        })
    return reserved_instances

def determine_expiry(expiry_date):
    # True only when the reservation expires exactly 21 days from now,
    # so the daily run sends a single notification per reservation
    now = datetime.now(timezone.utc)
    delta_min = timedelta(days=21)
    delta_max = timedelta(days=22)
    return delta_min <= expiry_date - now < delta_max

# Send result to SNS
def send_to_sns(messages):
    sns = boto3.client('sns')
    try:
        send_message = sns.publish(
            TargetArn=os.environ['SNS_TOPIC'],
            Subject='EC2-Reservation',
            Message=messages,
        )
        return send_message
    except ClientError as e:
        print("Failed to send message to SNS")
        print(e)

if __name__ == "__main__":
    for reservation, res_details in get_reserved_instances().items():
        if determine_expiry(res_details['ExpireDate']):
            sns_message = {
                "reservation": reservation,
                "expires": res_details['ExpireDate'].strftime("%m/%d/%Y, %H:%M:%S")
            }
            send_to_sns(json.dumps(sns_message))
I have an SNS topic set up that sends messages to a Lambda function on the backend so I can format my messages and send them to a Slack channel for notifications.
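That formatting function isn’t shown here, but a minimal sketch looks something like this; the SLACK_WEBHOOK_URL environment variable and the message layout are assumptions, not my actual setup:

import json
import os
import urllib.request

def lambda_handler(event, context):
    # SNS delivers the published message as a JSON string
    message = json.loads(event['Records'][0]['Sns']['Message'])

    payload = {
        "text": f"EC2 reservation {message['reservation']} expires {message['expires']}"
    }

    # Post to a Slack incoming webhook (URL supplied via environment variable)
    req = urllib.request.Request(
        os.environ['SLACK_WEBHOOK_URL'],
        data=json.dumps(payload).encode('utf-8'),
        headers={'Content-Type': 'application/json'},
    )
    urllib.request.urlopen(req)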
If you have any questions, feel free to comment or message me on Twitter!
-
Security Group ID Finder
I have been working on deploying resources to a lot of AWS accounts lately where each account has the same network infrastructure. When deploying Lambdas, I had the common name of the security group but not the ID. I wrote this utility to get the security group ID for me quickly.
import boto3
import sys

def get_security_group_id(common_name):
    ec2 = boto3.client("ec2", region_name="us-west-2")
    response = ec2.describe_security_groups()
    for security_group in response['SecurityGroups']:
        if security_group['GroupName'] == common_name:
            return security_group['GroupId']

if __name__ == '__main__':
    # Print usage when asked for help or when no argument is given
    if len(sys.argv) < 2 or sys.argv[1] in ("help", "--help", "usage", "--usage"):
        print("USAGE: python3 main.py <security group name>")
    else:
        sg_id = get_security_group_id(sys.argv[1])
        if sg_id is None:
            print("Security Group Not found")
        else:
            print(sg_id)
This is a simple tool that can be used on your command line by doing:
python3 main.py <security group name>
I hope this helps speed up your deployments. Feel free to share the code with your friends and team!
-
A Dynamo Data Migration Tool
Have you ever wanted to migrate data from one DynamoDB table to another? I haven’t seen an AWS tool to do this, so I wrote one using Python.
A quick walkthrough video:

import sys
import boto3

## USAGE ############################################################################
## python3 dynamo.py <Source_Table> <destination table>                            ##
## Requires two profiles to be set in your AWS Config file "source", "destination" ##
#####################################################################################

def dynamo_bulk_reader():
    session = boto3.session.Session(profile_name='source')
    dynamodb = session.resource('dynamodb', region_name="us-west-2")
    table = dynamodb.Table(sys.argv[1])

    print("Exporting items from: " + str(sys.argv[1]))

    response = table.scan()
    data = response['Items']

    # Keep scanning until the table has been fully paginated
    while 'LastEvaluatedKey' in response:
        response = table.scan(ExclusiveStartKey=response['LastEvaluatedKey'])
        data.extend(response['Items'])

    print("Finished exporting: " + str(len(data)) + " items.")
    return data

def dynamo_bulk_writer():
    session = boto3.session.Session(profile_name='destination')
    dynamodb = session.resource('dynamodb', region_name='us-west-2')
    table = dynamodb.Table(sys.argv[2])

    print("Importing items into: " + str(sys.argv[2]))

    # Open the batch writer once and reuse it for every item
    with table.batch_writer() as batch:
        for table_item in dynamo_bulk_reader():
            batch.put_item(Item=table_item)

    print("Finished importing items...")

if __name__ == '__main__':
    print("Starting Dynamo Migrator...")
    dynamo_bulk_writer()
    print("Exiting Dynamo Migrator")
The process is pretty simple. First, we get all of the data from our source table and store it in a list. Next, we iterate over that list and write each item to our destination table using the batch writer.
The program has been tested against tables containing over 300 items. Feel free to use it for your environments! If you do use it, please share it with your friends and link back to this article!
-
Querying and Editing a Single Dynamo Object
I have a workflow that creates a record inside of a DynamoDB table as part of a pipeline within AWS. The record’s primary key is the CodePipeline job ID. Later in the pipeline, I wanted to edit that object to append the status of resources created by the pipeline.
To do this, I created two functions: one that returns the item from the table, and a second that does the update and puts the updated item back into the table. Take a look at the code below and use it if you need to!
import boto3
from boto3.dynamodb.conditions import Key

def query_table(id):
    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table('XXXXXXXXXXXXXX')
    response = table.query(
        KeyConditionExpression=Key('PRIMARYKEY').eq(id)
    )
    return response['Items']

def update_dynamo_status(id, resource_name, status):
    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table('XXXXXXXXXXXXX')
    items = query_table(id)
    for item in items:
        # Do your update here
        response = table.put_item(Item=item)
    return response
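For illustration, here is one way the “Do your update here” step could be filled in for my use case of appending resource statuses. The “resources” map attribute is a hypothetical name for this sketch, not necessarily what is in the real table:

def update_dynamo_status(id, resource_name, status):
    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table('XXXXXXXXXXXXX')
    items = query_table(id)
    for item in items:
        # Record this resource's status under a "resources" map attribute
        # ("resources" is a hypothetical attribute name for illustration)
        item.setdefault('resources', {})[resource_name] = status
        response = table.put_item(Item=item)
    return response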
-
Pandas & NumPy with AWS Lambda
Fun fact: Pandas and NumPy don’t work out of the box with Lambda. The libraries that you might download on your development machine probably won’t work either.
The standard Lambda Python environment is very barebones by default. There is no point in loading a bunch of libraries if they aren’t needed. This is why we package our Lambda functions into ZIP files for deployment.
My first time attempting to use Pandas on AWS Lambda involved concatenating Excel files. The goal was to take a multi-sheet Excel file and combine it into one sheet for ingestion into a data lake. To accomplish this I used the Pandas library to build the new sheet. To automate the process, I set up an S3 trigger on a Lambda function to execute the script every time a file was uploaded.
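For context, the trigger side of that automation looks roughly like this; a sketch only, with placeholder names, not my exact handler:

import boto3

def lambda_handler(event, context):
    # The S3 trigger passes the uploaded object's location in the event
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = event['Records'][0]['s3']['object']['key']

    # Download the new upload locally, then hand it off to the
    # concatenation logic (not shown here)
    s3 = boto3.client('s3')
    s3.download_file(bucket, key, '/tmp/' + key.split('/')[-1])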
And then I ran into this error:
[ERROR] Runtime.ImportModuleError: Unable to import module 'your_module': IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE! Importing the numpy c-extensions failed.
I had clearly added the NumPy library to my ZIP file.
So what was the problem? Apparently, the version of NumPy that I downloaded on both my MacBook and my Windows desktop is not compatible with Amazon Linux.
To resolve this issue, I first attempted to download the package files manually from PyPI. I grabbed the latest “manylinux1_x86_64.whl” file for both NumPy and Pandas, put them back into my ZIP file, and re-uploaded it. This resulted in the same error.
THE FIX THAT WORKED:
The way to get this to work without failure is to spin up an Amazon Linux EC2 instance. Yes, this seems excessive, and it is. Not only did I have to spin up a new instance, I had to install Python 3.8, because Amazon Linux ships with Python 2.7 by default. But once that was installed, you can use pip to install the libraries into a directory by doing:
pip3 install -t . <package name>
This is useful for getting the libraries into one location to ZIP back up for use. You can remove a lot of the files that are not needed by running:
rm -r *.dist-info __pycache__
After you have done the cleanup, you can ZIP up the files, move them back to your development machine, add your Lambda function code, and upload everything to the Lambda console.
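For example, from inside the package directory (the archive name here is just an example):

zip -r9 ../lambda-package.zip .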
Run a test! It should work as you intended now!
If you need help with this please reach out to me on social media or leave a comment below.
-
Concatenating Multi-Sheet Excel Files with Python
I recently came across a data source that used multiple sheets within an Excel file. My dashboard cannot read a multi-sheet Excel file, so I needed to combine them into one sheet.
The file is uploaded into an S3 bucket and then needs to move through the data lake to be read into the dashboard. The final version of this script will be a Lambda function that triggers on upload of the file, concatenates the sheets, and places a new file into the next layer of the data lake.
Using Pandas, you can easily accomplish this task. One issue I did run into is that Pandas will no longer read XLSX files, so I had to convert the file down into an XLS file, which is easily done through Excel. In the future this will also have to be done programmatically. Let’s get into the code.
import pandas as pd

workbook = pd.ExcelFile('file.xls')
sheets = ['create', 'a', 'list']
dataframe = []

for sheet in sheets:
    df = pd.read_excel(workbook, sheet_name=sheet,
                       skiprows=[list of rows to skip],
                       skipfooter=number_of_rows_to_skip_from_bottom)
    df.columns = ['list', 'of', 'column', 'headers']
    dataframe.append(df)

df = pd.concat(dataframe)
df.to_excel("output.xls", index=False)
To start, we import the Pandas library and read in our Excel file. In a future revision of this script I will be reading the file from S3 through the Lambda event, so this will need to change.
The “sheets” variable is a list of the sheets that you want the script to look at. You can remove it if you want the script to look at every sheet; my file had a few sheets that could be ignored. We also create an empty list called “dataframe”, which will store each of the sheets we want to concatenate. In the production version of this script there are some modifications that need to be done on each sheet. I accomplished this by adding “if/then” statements based on the sheet name, as sketched below.
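Something along these lines, where the sheet names and the adjustments are placeholders rather than my real ones:

for sheet in sheets:
    df = pd.read_excel(workbook, sheet_name=sheet)

    # Hypothetical per-sheet adjustments, keyed on the sheet name
    if sheet == 'summary':
        df = df.drop(columns=['Notes'])
    elif sheet == 'detail':
        df = df.dropna(how='all')

    dataframe.append(df)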
At the end of the “for” loop we append each data frame to our empty list. Once all the sheets have been added, we use Pandas to concatenate the objects and write the output file. You can specify your own output file name. I also included “index=False”, which drops the leading column of index numbers; it is not needed for my project.
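As an aside on the XLSX limitation mentioned earlier: if the conversion does need to happen programmatically, one option (assuming the openpyxl package is available in the environment) is to have Pandas read the XLSX directly by naming the engine:

# Read an XLSX by specifying an engine explicitly (openpyxl must be installed)
workbook = pd.ExcelFile('file.xlsx', engine='openpyxl')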
So there you have it, a simple Python script to concatenate a multi-sheet Excel file. If this script helps you please share it with your network!
-
Where Is It 5 O’Clock Pt: 4
As much as I’ve scratched my head working on this project, it has been fun to learn some new things and build something that isn’t infrastructure automation. I’ve learned some frontend web development, some backend development, and utilized some new Amazon Web Services products.
With all that nice stuff said, I’m proud to announce that I have built a fully functioning project that finally works the way I intended. You can visit the website here:
To recap, I bought this domain one night as a joke and thought, “Hey, maybe one day I’ll build something.” I started off building a fully Python application backed by Flask. You can read about that in Part 1. This did not work out the way I intended, as it did not refresh the timezones on page load. In Part 3 I discussed how I was rearchitecting the project to include an API that would be called on page load.
The API worked great and delivered two JSON objects to my frontend. I then parsed the two JSON objects into two separate tables that display where you can be drinking and where you probably shouldn’t be drinking.
This is a snippet of the JavaScript I wrote to iterate over the JSON objects while adding them into the appropriate table:
function buildTable(someinfo){
    var table1 = document.getElementById('its5pmsomewhere')
    var table2 = document.getElementById('itsnot5here')

    var its5_json = JSON.parse(someinfo[0]);
    var not5_json = JSON.parse(someinfo[1]);

    its5_json['its5'].forEach((value, index) => {
        var row = `<tr>
                       <td>${value}</td>
                       <td></td>
                   </tr>`
        table1.innerHTML += row
    })

    not5_json['not5'].forEach((value, index) => {
        var row = `<tr>
                       <td></td>
                       <td>${value}</td>
                   </tr>`
        table2.innerHTML += row
    })
}
First I reference two different HTML tables. I then parse the JSON from the API, iterate over both JSON objects, and add each timezone as a row in the appropriate HTML table.
If you want more information on how I did this feel free to reach out.
I want to continue iterating on this application to add new features. I need to do some standard things like adding Google Analytics so I can track traffic. I also want to add a search feature and a map that displays the different areas of drinking acceptability.
I am also open to requests. One of my friends suggested that I add a countdown timer for each location where it is not yet acceptable to be drinking.
Feel free to reach out in the comments or on your favorite social media platform! And as always, if you liked this project please share it with your friends.
-
Where Is It Five O’Clock Pt: 3
So I left this project at a point where I felt it needed to be re-architected, based on the fact that Flask only executes the function once, not every time the page loads.
I re-architected the application in my head to include an API that calls the Lambda function and returns a list of places where it is and is not acceptable to be drinking based on the 5 O’Clock rules. These two lists are JSON objects, each with a single key and multiple values; the values are the timezones where it is appropriate to be drinking.
After the JSON objects are generated I can reference them through the web frontend and display them in an appropriate way.
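Based on how the frontend parses the response (see the JavaScript in the Part 4 post above), the payload is an array of two JSON strings, each with a single key. The timezone values below are purely illustrative:

[
    "{\"its5\": [\"America/Chicago\", \"Pacific/Honolulu\"]}",
    "{\"not5\": [\"America/New_York\", \"Europe/London\"]}"
]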
At this point I have the API built out and fully functioning the way I think I want it. You can use it by executing the following:
curl https://5xztnem7v4.execute-api.us-west-2.amazonaws.com/whereisit5
I will probably only have this publicly accessible for a few days before locking it back down.
Hopefully, in part 4 of this series, I will have a frontend demo to show!
-
Where Is It 5 O’Clock Pt: 2
So I spent the evening deploying this web application to Amazon Web Services. In my test environment, everything appeared to be working great, because every time I reloaded the page it re-ran the function as well.
When I transferred this over to a live environment, I realized the Python function only ran when I committed a change and it was re-deployed to my Elastic Beanstalk environment.
This poses a new problem. If the function doesn’t fire every time the page is refreshed, the time won’t update properly and it will show incorrect areas of where it is 5 O’Clock. Ugh.
So, over the next few weeks, in my spare time, I will be re-writing this entire application to function the way I intended it to.
I think to do this I will write each function as an AWS Lambda function and then write a frontend that calls those functions on page load. Or the entire thing will be one function that returns all the information in a single API call.
I also really want to display a map that shows the areas where it is 5 PM or later, but I think this will come in a later revision once the project is actually functioning correctly, along with some more CSS to make it pretty and responsive so it works on all devices.
The punch list is getting long…
Follow along here: https://whereisitfiveoclock.net