Why I’m Learning to Say No (Even to Cool Stuff)

Really effective people say no to almost everything.

Lately, I’ve been thinking a lot about how I spend my time and energy. With new GenAI tools, it often feels like I can do the work of five people. My backlog practically clears itself. That kind of power is thrilling, at first. I can build faster than ever. It really does feel amazing. But then my momentum plummets when I realize a lot of what I created doesn’t actually move me forward in a meaningful way. It was fun, but not focused.

I’ve had to relearn how to say no to the 90 things that are interesting and enjoyable so I can say yes to the 10 that really matter. It’s an age-old lesson, but its relevance is heightened in this GenAI-driven era of software development.

Just because I can do more doesn’t mean everything is worth doing. GenAI massively boosts my output, but my time and attention are still finite. If I’m not intentional, I end up spending my best hours on things that look impressive but don’t really help me progress. The ability to do more has made it even more important to stay focused on what matters most.

Saying no isn’t always easy, especially when an idea feels exciting and GenAI makes the cost of implementation a small fraction of what it used to be. But now I pause and ask myself: “Is this actually moving me forward in a meaningful way?” If it’s not, I save the idea for later.

Thoughts on Hackerrank's Big Sorting Problem

I enjoy using Hackerrank to sharpen my understanding of fundamental computer science concepts.

I recently solved their “Big Sorting” problem and would like to share my thoughts on it.

I will assume you understand the problem already. Here is the problem statement in case you’re unfamiliar with it.

The problem seems to be fairly straightforward: sort a list of numbers represented as strings.

It’s easy to write a one-liner in Python to accomplish this:

numbers_array = ['9', '1', '3']
sorted(list(map(lambda number: int(number), numbers_array)))

It’s just three easy steps:

  1. Convert each numeric string to an actual int object
  2. Create a list with the newly created ints
  3. Sort the list

This approach seems too easy since the problem only has a ~63% solve rate on Hackerrank. It can’t really be this easy, right?

Unfortunately, this naive approach doesn’t pass several of Hackerrank’s test cases because it’s too slow.

In what cases do you think this approach would be too slow?

It turns out that very large numbers make this approach too slow. I bought one of Hackerrank’s test cases that was failing, and realized that one of the numeric strings in the list of inputs was 988k characters long. That’s a really big number! Thankfully, Python 3’s int class can handle ints that size, but there’s a big catch: it takes a while to instantiate them. To be precise, it took 5.96 seconds to instantiate the 988k digit long number as an int.

The script below takes 6.16 seconds to execute, so about 96.8% of the execution time was taken up instantiating that massive int.

numbers_array = ['9', '1', '3', '9'*988000]
sorted(list(map(lambda number: int(number), numbers_array)))

Python 3’s int class can represent numbers of that length; the problem is that parsing such a long decimal string into an int is slow, because every digit must be converted to binary. The Decimal class, which stores digits in base 10, handles the conversion much faster. To be precise, Decimal('<insert 988k digit long number here>') only takes 0.06 seconds. This is ~98.7% faster than using int.
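The gap is easy to reproduce yourself. Here is a minimal timing sketch using only the standard library; I use a 200k-digit number (smaller than 988k, same effect) to keep the run short, and note that Python 3.11+ caps int/str conversion length by default, so the cap has to be lifted first. Exact timings will vary by interpreter version and CPU.

```python
import sys
import time
from decimal import Decimal

# Python 3.11+ limits int-from-string conversion to 4300 digits by default;
# lift the cap so the comparison runs on newer interpreters.
if hasattr(sys, 'set_int_max_str_digits'):
    sys.set_int_max_str_digits(1_000_000)

big = '9' * 200_000  # 200k digits: smaller than 988k, but the same effect

start = time.perf_counter()
as_int = int(big)
int_seconds = time.perf_counter() - start

start = time.perf_counter()
as_decimal = Decimal(big)
decimal_seconds = time.perf_counter() - start

print('int: %.3fs, Decimal: %.3fs' % (int_seconds, decimal_seconds))
```

On every interpreter I’m aware of, the Decimal parse finishes far faster than the int parse, because Decimal keeps the digits in base 10 rather than converting them to binary.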

So, to solve the “Big Sorting” problem just use Decimal instead of int:

from decimal import Decimal
numbers_array = ['9', '1', '3', '9'*988000]
sorted(list(map(lambda number: Decimal(number), numbers_array)))

The smarter implementation takes 0.07 seconds to run on my CPU.

Alembic Migrations: How to Execute Raw SQL on New Tables

Problem

You have Alembic migrations. The current migration creates a new table and you want to insert rows into that table using raw SQL.

You are trying to establish a connection with the database using a DBAPI such as psycopg2, but when you try to insert rows you see an error that says something like the table doesn’t exist yet.

Your migration might look something like this:

import psycopg2
import sqlalchemy as sa
from alembic import op

def upgrade():
    op.create_table(
        'person',
        sa.Column('id', sa.Integer(), nullable=False),
        sa.Column('name', sa.String(), nullable=False),
    )

    people = ['Harry', 'Ron', 'Hermione']

    conn = psycopg2.connect('postgres://username:password@localhost/dbname')
    cursor = conn.cursor()

    insert_person_sql = "INSERT INTO person (name) VALUES ('{name}');"
    for person in people:
        cursor.execute(insert_person_sql.format(name=person))

    conn.commit()

When attempting to run the migration, it fails on the cursor.execute(...) line.

Why does this happen? I believe it is because Alembic runs each migration inside a single transaction. The Alembic DB session created the person table, but the change is not committed to the database until the entire migration completes.

Therefore, the other connection, conn, established with the psycopg2 library is not aware of the person table while the migration is still running.
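The same isolation can be demonstrated outside Alembic and Postgres with two connections to one SQLite database: a table created inside an uncommitted transaction on one connection is invisible to a second connection. SQLite stands in for Postgres here purely for convenience; the transactional behavior is what matters.

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), 'demo.db')

# Connection 1 creates a table inside an explicit transaction and does not commit.
writer = sqlite3.connect(db_path, isolation_level=None)  # autocommit mode; we issue BEGIN/COMMIT ourselves
writer.execute('BEGIN')
writer.execute('CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)')
writer.execute("INSERT INTO person (name) VALUES ('Harry')")

# Connection 2 cannot see the uncommitted table.
reader = sqlite3.connect(db_path)
try:
    reader.execute('SELECT count(*) FROM person')
    table_visible = True
except sqlite3.OperationalError:  # "no such table: person"
    table_visible = False
print(table_visible)  # False

# Once connection 1 commits, connection 2 sees the table.
writer.execute('COMMIT')
count = reader.execute('SELECT count(*) FROM person').fetchone()[0]
print(count)  # 1
```

This mirrors the migration failure: the psycopg2 connection plays the role of `reader`, querying a table that only exists inside another connection's open transaction.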

Solution

If you find yourself in this situation, the solution may be to bind to the Alembic session. Then you can execute SQL in a context that is aware of the new person table.

The new code will look something like this:

import sqlalchemy as sa
from alembic import op
from sqlalchemy.orm import sessionmaker

Session = sessionmaker()

def upgrade():
    op.create_table(
        'person',
        sa.Column('id', sa.Integer(), nullable=False),
        sa.Column('name', sa.String(), nullable=False),
    )

    bind = op.get_bind()
    session = Session(bind=bind)

    people = ['Harry', 'Ron', 'Hermione']

    insert_person_sql = sa.text("INSERT INTO person (name) VALUES (:name)")
    for person in people:
        # bound parameters avoid quoting issues and SQL injection
        session.execute(insert_person_sql, {'name': person})

Note, Alembic components such as the revision and down_revision fields, as well as the downgrade() function, have been omitted for brevity.

Gotchas of Opening CSV in Excel

Sometimes Excel converts CSV integers into scientific notation and/or truncates decimal points. This behavior is undesirable in some cases. There is an easy way to avoid this (see below). But before we apply the solution we need to know: when exactly does this happen?

Integers

Integers up to 10 digits are safe; integers 11 digits or longer are converted to scientific notation.

Inserting a tab character before an 11 digit integer prevents Excel from converting the number to scientific notation.

Here is some Python code that will insert the tab for appropriate cell values:

def insert_tab(cell_value):
  if len(cell_value) > 10:
    return "\t%s" % cell_value
  return cell_value

For example, the CSV input would be:

somevalue,12345123451,anothervalue,1234512345

And the CSV output would be:

somevalue,	12345123451,anothervalue,1234512345

Notice that a tab was inserted before the integer that was greater than 10 digits long, while the integer that was exactly 10 digits long was untouched.

Decimals

15-n decimal places are kept and the rest are thrown out, where n is the number of digits to the left of the decimal point. For example, if 2 digits are to the left of the decimal point, then 13 digits are kept to the right of it.

Inserting a tab character before a decimal with more than 15 total digits will make Excel keep all digits.
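Both rules can be rolled into one helper. This is only a sketch based on the observations above (more than 10 digits for integers, more than 15 total digits for decimals), and the guard_cell name is my own, not anything Excel-specific:

```python
def guard_cell(cell_value):
    """Prefix a numeric string with a tab when Excel would mangle it."""
    digits = cell_value.replace('-', '').replace('.', '')
    if not digits.isdigit():
        return cell_value  # not numeric; leave untouched
    if '.' in cell_value:
        # Decimals: more than 15 total digits get truncated without the tab.
        if len(digits) > 15:
            return '\t' + cell_value
    elif len(digits) > 10:
        # Integers: 11+ digits get converted to scientific notation.
        return '\t' + cell_value
    return cell_value

row = ['somevalue', '12345123451', '1234512345', '12.4567890123456789']
print(','.join(guard_cell(v) for v in row))
```

Run over a whole row before writing the CSV, only the risky values pick up the leading tab; safe values pass through unchanged.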

Wrapping Up

These two behaviors when opening CSV in Excel are consistent regardless of the column location of the value. In other words, it can be at the beginning, end or middle of a series of columns.

In each case, inserting a tab character will stop Excel from converting integers to scientific notation and truncating decimals.

These tests were performed with Microsoft Excel for Mac, version 16.9 (180116).

Python Arrays: Extend vs Append

It is important to know the difference between extend and append when working with arrays (lists) in Python.

Let’s say we have an example array: my_array = [].

If an array is passed to append (e.g. my_array.append([1, 2])), then my_array will become a 2D array. It will look like this: [[1, 2]]. In other words, append won’t flatten the arrays together into a 1D array.

However, it is possible to combine two arrays like this and end up with a 1D array using Python’s extend method. Calling my_array.extend([1, 2]) will result in my_array looking like this: [1,2].

Here is an example using the Python REPL:

>>> a = [1]
>>> b = [1, 2]
>>> c = [1, 2, 3]
>>> a.append(b)
>>> a
[1, [1, 2]]
>>> c.extend(b)
>>> c
[1, 2, 3, 1, 2]

Python 3.5 introduced another way to combine arrays. The unpacking operator * can be used in the following way:

>>> [1, 2, 3, *[4, 5, 6]]
[1, 2, 3, 4, 5, 6]

Note, the unpacking approach does not work in Python 2.7.14.
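For completeness, the + operator also produces a flat list, and unlike * unpacking it works in Python 2 as well:

```python
a = [1, 2, 3]
b = [4, 5, 6]

# + returns a new, flat list and leaves a and b untouched
combined = a + b
print(combined)  # [1, 2, 3, 4, 5, 6]

# += behaves like extend: it mutates the left-hand list in place
a += b
print(a)  # [1, 2, 3, 4, 5, 6]
```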

How-To Run Only a Specific Test from a PyTest Suite

Put @pytest.mark.<identifier> above the test function (pytest only collects functions whose names start with test_).

Run $ pytest -v -m <identifier>.

<identifier> can be anything you choose.

For example, a test file may look like this:

import pytest

@pytest.mark.focus
def test_first():
  # do a test here
  print('first test')

def test_second():
  # do another test here
  print('second test')

And then you can run the following command: pytest -v -m focus. The verbose output will show that only the marked test was collected and run. (Note that pytest captures stdout by default, so pass -s as well if you want to see the print output.)

Don’t forget to import pytest.

The -v flag will make the output more verbose so you can see everything that is going on.

The -m flag expects one argument: a mark expression. It will tell pytest to only run tests matching the identifier given.
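One caveat: recent pytest versions warn about marks that have not been registered, and fail outright under --strict-markers. Registering the mark in pytest.ini quiets this; the focus name below is just the example identifier from above.

```ini
[pytest]
markers =
    focus: marks a test to be run in isolation via -m focus
```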