Coding mistakes I've made
Here I will list some of the simple bugs, which cost me lots and lots of time to debug
Database multi insert
Muti inserts are used to speed up inserting records into a database. You can essentially insert multiple records on one database call. Saves connections and cursors.
-- Single insert
INSERT INTO user (name) VALUES ("Chris")
-- Multi insert
INSERT INTO user (name) VALUES ("Chris"), ("John Doe")
I wrote a chunking mechanism in Python which would build up a list of insert values, and insert them all in one call, then continue onto the next chunk.
The code below has been simplified to bring across the idea, rather than the implementation
insertable_values = range(1, 1234)
with get_db_connection() as connection:
with connection.cursor() as cursor:
# A list of values for the next insert
insert_values = []
for insertable in insertable_values:
# Compile a escaped insert value string
insert_string = cursor.mogrify("(%s)", [insertable]).decode('utf-8')
# Add the insert string to the values we want to insert
insert_values.append(insert_string)
# When we get to 100 values to insert, run the insert, and clear the insert list so we can start fresh
if len(insert_values) >= 100:
cursor.execute("INSERT INTO some_table (number) VALUES {}".format(", ".join(insert_values)))
connection.commit()
insert_values = []
# Once the for loop is done,
# We need to insert the remaining values in the array
cursor.execute("INSERT INTO some_table (number) VALUES {}".format(", ".join(insert_values)))
There is a bug in this code which took me an hour and a half to find.
Most runs would pass, but a small hand full would fail.
The reason for this is we compile lists of 100 items, then insert them, and when the list is done, we take the remaining items (which don't fall in the 100 grouping), and insert them as the "leftovers".
But if the list of values to insert is a multiple of 100, say 300, the inner loop will run 3 times, and when we run the remaining, there are 0 items left, and the insert statement will be malformed.
The fix is quite simple. Wrap the Remainder insert in a if statement to make sure there are items left.
insertable_values = range(1, 1234)
with get_db_connection() as connection:
with connection.cursor() as cursor:
# A list of values for the next insert
insert_values = []
for insertable in insertable_values:
# Compile a escaped insert value string
insert_string = cursor.mogrify("(%s)", [insertable]).decode('utf-8')
# Add the insert string to the values we want to insert
insert_values.append(insert_string)
# When we get to 100 values to insert, run the insert, and clear the insert list so we can start fresh
if len(insert_values) >= 100:
cursor.execute("INSERT INTO some_table (number) VALUES {}".format(", ".join(insert_values)))
connection.commit()
insert_values = []
# Once the for loop is done,
# We need to insert the remaining values in the array
if len(insert_values) > 0:
cursor.execute("INSERT INTO some_table (number) VALUES {}".format(", ".join(insert_values)))
Python static property
When working with Python classes, I sometimes revert to my old habits, and place static properties on classes such as
class Person:
# Here is the static property
errors = []
def validate(self):
self.errors.append("some error")
if len(self.errors) > 0:
return False
return True
def get_errors(self):
return self.errors
This seems fine at the face of it. And when developing locally this does not cause any problems, as everytime you change code, the wsgi server restarts (`gunicorn --reload`)
But when you deploy this with gunicorn workers, the static property will persist across invocations. So if the previous Person object in this case, had errors added, the next invocation without errors will still have errors on the object as the static errors
persists.
The solve for this is to declare the errors property on init
class Person:
def __init__(self):
# The property is now initialized when the object is created
self.errors = []
def validate(self):
self.errors.append("some error")
if len(self.errors) > 0:
return False
return True
def get_errors(self):
return self.errors
The property will now be initialized to a empty array whenever the object is created.
I do still like declaring the static property as it makes it more easily readable for me which properties are available on the class, but I'll stick with constructor initialisation for now.