Fun with Python Dataclasses, Inheritance, and CSVs

A client asked me to do a bit of data collection and collation from a few sites he visits. Specifically, he was looking for a list of articles and their authors from popular sites in the circles wherein he operates.

So, as I often do, I reached for my beloved Python.

Ahem.

The goal was to create two CSV files: one with the articles, and the other with the authors. I wasn’t collecting a ton of data about either one, really—names, links, publication dates. Nothing fancy.

I created two dataclasses, named accordingly: one for articles and one for authors.
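For illustration, a pair of dataclasses along these lines might look like the sketch below. The class names `Article` and `Author` and every field on them are assumptions based on the data described above (names, links, publication dates), not the actual code from the project.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Author:
    # Hypothetical fields: a display name and a link to the author's page.
    name: str
    profile_url: str


@dataclass
class Article:
    # Hypothetical fields: title, link, author name, and publication date.
    title: str
    url: str
    author_name: str
    published: date


# The decorator generates __init__, __repr__, and __eq__ from the annotations.
article = Article(
    title="An Example Post",
    url="https://example.com/an-example-post",
    author_name="Jane Doe",
    published=date(2023, 1, 15),
)
print(article)
```

No hand-written `__init__` anywhere, which is most of the appeal.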

If you’re not hip to dataclasses in Python, do yourself a favor and change that. They’re new-ish as of version 3.7 and I use them in just about every project I work on these days, mostly because they require so much less typing. Check out Arjan’s 10-ish-minute intro video if you want a fast primer.

Anyway, I wanted to be able to easily create CSV files from both of these types. Being a longtime fan of The Pragmatic Programmer, I do my level best to adhere to the DRY (Don’t Repeat Yourself) principle at all times.

Which meant that it was time to write a superclass. I called it CSVable because why the hell not.

From there, it was simply a matter of subclassing CSVable in the two dataclasses and writing the six lines of code to crap the whole mess into a couple of text files. Glorious, all of it.
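A minimal sketch of how that could fit together, assuming the header and rows come straight from the dataclass field names. The method names `csv_header` and `csv_row`, the `write_csv` helper, and the `Author` fields are all hypothetical, and an in-memory buffer stands in for a real file so the example runs anywhere.

```python
import csv
import io
from dataclasses import dataclass


class CSVable:
    """Mix-in that turns a dataclass into a CSV header row and content rows."""

    @classmethod
    def csv_header(cls):
        # Field names, in the order they were declared on the dataclass.
        return list(cls.__dataclass_fields__)

    def csv_row(self):
        # This object's values, in the same order as the header.
        return [getattr(self, name) for name in self.__dataclass_fields__]


@dataclass
class Author(CSVable):
    # Hypothetical fields, for illustration only.
    name: str
    profile_url: str


def write_csv(out, records):
    """Write a header plus one row per record (assumes a non-empty list)."""
    writer = csv.writer(out)
    writer.writerow(type(records[0]).csv_header())
    writer.writerows(record.csv_row() for record in records)


authors = [
    Author("Jane Doe", "https://example.com/jane"),
    Author("Sam Roe", "https://example.com/sam"),
]

# For a real file: open("authors.csv", "w", newline="") instead of StringIO.
buffer = io.StringIO()
write_csv(buffer, authors)
csv_text = buffer.getvalue()
print(csv_text)
```

Passing `newline=""` when opening a real file is what the `csv` module documentation recommends, so the writer stays in charge of line endings.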

The especially cool part was using the exposed __dataclass_fields__ dictionary to generate both the header row (using a class method) and the content row for each object. That bit about the header row is probably my favorite part; I didn’t know the first positional argument to a class method is the calling class—the subclass, in this case.
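To make the class-method behavior concrete, here is a small sketch. The classes, fields, and the `who_is_calling` helper are hypothetical. It uses `dataclasses.fields()`, the documented accessor, which for plain fields like these returns the same names in the same order as `__dataclass_fields__`.

```python
from dataclasses import dataclass, fields


class CSVable:
    @classmethod
    def csv_header(cls):
        # cls is the class the method was called on, not CSVable itself.
        return [f.name for f in fields(cls)]

    @classmethod
    def who_is_calling(cls):
        # Hypothetical helper, only here to show what cls is bound to.
        return cls.__name__


@dataclass
class Article(CSVable):
    title: str
    url: str


@dataclass
class Author(CSVable):
    name: str


print(Article.who_is_calling())  # the subclass name, not "CSVable"
print(Article.csv_header())
print(Author.csv_header())

# The raw dictionary carries the same field names in declaration order.
print(list(Article.__dataclass_fields__))
```

One inherited method, a different header for each subclass, and no per-class code.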

It’s only about eight lines of code, but it’s a simple, totally flexible little mix-in that I can use to CSV-ify pretty much any dataclass I write going forward.