Emperor of the Renaud Empire (population 6). Print/interactive designer for StL Post-Dispatch. Lover of new tech, Rememberer of old tech. Texan in exile.
32 stories

Some notes on self-publishing a tech book

1 Comment

So my book, Data Science for Crime Analysis with Python, is finally out for purchase on my Crime De-Coder website. Folks anywhere in the world can purchase a paperback or epub copy of the book. You can see this post on Crime De-Coder for a preview of the first two chapters, but I wanted to share some of my notes on self publishing. It was some work, but in retrospect it was worth it. Between the prior books I have been involved with (Wheeler 2017; Wheeler et al. 2021) and my peer review experience, I knew I did not need help copy-editing, so the notes are mostly about creating the physical book and the logistics of selling it.

Academics may wish to go with a publisher for prestige reasons (I get it, I was once a professor as well). But it is quite nice once you have done the legwork to publish it yourself. You have control of pricing, and if you want to make money you can, or have it cheap/free for students.

Here I will detail some of the set up of compiling the book, and then the bit of work to distribute it.

Compiling the documents

So the way I compiled the book is via Quarto. I posted my config notes on how to get the book contents to look how I wanted on GitHub. Quarto is meant to run code at the same time (so works nicely for a learning to code book). But even if I just wanted a more typical science/tech book with text/images/equations, I would personally use Quarto since I am familiar with the set up at this point. (If you do not need to run dynamic code you could do it in Pandoc directly, not sure if there is a way to translate a Quarto yaml config to the equivalent Pandoc code it turns into.)

One thing that I think will interest many individuals – you write in plain text markdown. So my writing looks like:

# Chapter Heading

blah, blah blah

## Subheading

Cool stuff here ....

This lives in a series of text files, one for each chapter of the book. I then run quarto render, and it turns my writing in those text files into both an Epub and a PDF (and other formats if you cared, such as Word or HTML). You can set up the configuration for the book to be different for the different formats (for example I use different fonts in the PDF vs the epub, since nice fonts in one look quite bad in the other). See the _quarto.yml file for the set up, in particular the config options that differ between PDF and Epub.

One thing is that ebooks are hard to format nicely – if I had a book I wanted to redo as an epub, I would translate it to markdown. There are services online that will translate, but they will do a bad job with scientific texts with many figures (and surely will not help you choose nice fonts). So just learn markdown to translate. Folks who write in one format and save to the other (either Epub/HTML to PDF, or PDF to Epub/HTML) are doing it wrong and the translated format will look very bad. Most advice online is for people whose books are just text, so for science people with figures (and footnotes, citations, hyperlinks, equations, etc.) it is almost all bad advice.

So even for qualitative people, learning how to write in markdown to self-publish is a good skill to learn in my opinion.

Setting up the online store

For a while I have been confused about how SaaS companies offer payment plans. (Many websites just seem to copy from generic node templates.) Looking at the Stripe API, it just seems over the top for me to script up my own solution to integrate Stripe directly. If I wanted to do a subscription I may need to figure that out, but it turned out that for my Hostinger website I can set up a sub-page that is WordPress (even though the rest of the website is not), and turn on WooCommerce for that sub-page.

WooCommerce ends up being easy, and you can set up the store to host web-assets to download on demand (so when you purchase it generates a unique URL that obfuscates where the digital asset is saved). No programming involved to set up my webstore, it was all just point and click to set things up one time and not that much work in the end.

I am not sure about setting up any DRM for the epub (so in reality people will purchase epub and share it illegally). I don’t know of a way to prevent this without using Amazon+Kindle to distribute the book. But the print book should be OK. (If there were a way for me to donate a single epub copy to all libraries in the US I would totally do that.)

I originally planned on having it on Amazon, but the low margins on both plus the formatting of their idiosyncratic kindle book format (as far as I can tell, I cannot really choose my fonts) made me decide against doing either the print or ebook on Amazon.

Print on Demand using LuLu

For print on demand, I use LuLu.com. They have a nice feature to integrate with WooCommerce; the only thing I wish is that shipping was dynamically calculated. (The way it is set up now, I need to set a flat shipping rate for different areas around the globe, which is slightly annoying and changes the profit margins depending on area.)

LuLu is a few more dollars to print than Amazon, but I believe it is worth it for my circumstance. Now if I had a book I expected to get many “random Amazon search buys” I could see wanting it on Amazon. I expect more sales will be via personal advertising (like here on the blog, social media, or other crime analyst events). My Crime De-Coder site (and this blog) will likely be quite high in Google searches for some of the keywords fairly quickly, so who knows, maybe just having it on my personal site will bring just as many sales.

LuLu does have an option to turn on distribution to other wholesalers (like Barnes & Noble and Amazon) – I have not turned that on but maybe I will in the future.

LuLu has a pricing calculator on their website to see how much a book costs to print. Paperback and basically the cheapest color option for letter sized paper (which is quite large) is just over $17 for my 310 page book (Amazon was just over $15). If your book is less image heavy and more text, you could get away with a smaller size book (and maybe black/white), which I suspect will be much cheaper. LuLu’s printing of this book is higher quality compared to Amazon as well (better printing of the colors and nicer stock for the paperback cover).
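As a rough sketch of how a flat shipping rate moves the margin by region, the arithmetic might look like the following. Only the $17 print cost comes from the text above; the list price, the flat shipping rates, and the actual shipping costs are hypothetical placeholders, and payment processing fees are ignored.

```python
PRINT_COST = 17.00  # LuLu print cost quoted above for the 310 page color paperback


def profit_per_copy(list_price, shipping_charged, shipping_actual, print_cost=PRINT_COST):
    """Profit on one paperback: money collected minus print and real shipping cost."""
    return round(list_price + shipping_charged - print_cost - shipping_actual, 2)


# Hypothetical list price and shipping figures, for illustration only.
LIST_PRICE = 50.00
regions = {
    "US": {"flat": 8.00, "actual": 6.50},
    "Europe": {"flat": 15.00, "actual": 18.00},
}

for name, ship in regions.items():
    print(name, profit_per_copy(LIST_PRICE, ship["flat"], ship["actual"]))
```

With these made-up numbers the US copy clears 34.5 and the European copy 30.0, so the same list price yields a different margin wherever the flat rate over- or under-shoots the real shipping cost.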

Another nice thing about print on demand is I can go in and edit/update the book as I see fit. No need to worry about new versions. Not sure what that exactly means for citing the work (I could always go and change it), you can’t have a static version of record and an easy way to update at the same time.

Other Random Book Stuff

I purchased ISBNs on Bowker, something like 10 ISBNs for $200. (You want a unique ISBN for each format of the book, so you may want three in the end if you have epub/paperback/hardback.) Amazon and LuLu both have options to give you an ISBN, so that may not have been necessary. I set the imprint to be my LLC in Bowker though, so CRIME De-Coder is the publisher.

You don’t technically need an ISBN at all, but it is a simple thing, and there may be ways for me to donate to libraries in the future. (If a University picks it up as a class text, I have been at places where you need at least one copy for rent at the Uni library.)

I have not created an index – I may have a go at feeding my book through LLMs and seeing if I can auto-generate a nice index. (I just need a list of key words, after that I can just go and find-replace the relevant text in the book to fill it in so it auto-compiles an index.) I am not sure that is really necessary for a how-to book though, you should just look at the table of contents to see the individual (fairly small) sections. For epub you can just do a direct text search, so I am not sure if people use an index at all in epubs.
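The find-replace step could be sketched roughly like this. It assumes the PDF is built through LaTeX with an index package already set up, so that a raw \index{...} marker in the markdown ends up in the compiled index; the function name, the sample sentence, and the keyword list are all hypothetical.

```python
import re


def add_index_entries(text, keywords):
    """Append a LaTeX \\index{...} marker after the first match of each keyword."""
    for kw in keywords:
        # Whole-word, case-insensitive match; re.escape guards punctuation in keywords.
        pattern = re.compile(r"\b" + re.escape(kw) + r"\b", re.IGNORECASE)
        # A function replacement sidesteps backslash handling in replacement templates.
        text = pattern.sub(
            lambda m, kw=kw: m.group(0) + "\\index{" + kw + "}", text, count=1
        )
    return text


chapter = "We fit a Poisson regression, then plot a kernel density map of the crimes."
marked = add_index_entries(chapter, ["Poisson regression", "kernel density"])
print(marked)
```

This only tags the first occurrence of each keyword and makes no attempt to skip code blocks or headings, so a real pass over the chapter files would need a manual check afterwards.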

Personal Goals

So I debated releasing the book open source, but I do want to try and see if I can make some money. I don’t have this expectation, but there is potential to get some “data science” spillover, and if that is the case sales could in theory be quite high. (I was surprised in searching the “data science python” market on Amazon, it is definitely not saturated.) Personally I will consider at least 100 sales to be my floor for success. That is, if I can sell at least 100 copies, I will consider writing more books. If I can’t sell 100 copies I have a hard time justifying the effort – it would just be too few people buying the book to have the types of positive spillovers I want.

To make back money relative to the amount of work I put in, I would need more than 1000 sales (which I think is unrealistic). I think 500 sales is about best case, guesstimating the size of the crime analyst community that may be interested plus some additional sales for grad students. To reach 1000 sales, it would take multiple professors using it as a class book over several years. (If you are a professor and interested in this for a class let me know, I will give your class a discount.)

Another common way for individuals to make money off of books is not from sales, but from trainings oriented around the book. I am hoping to do more of that for crime analysts directly in the future, but those opportunities I presume will be correlated with total sales.

I do enjoy writing, but I am busy, so cannot just say “I am going to drop 200 hours writing a book”. I would like to write additional python topics oriented towards crime analysts/criminology grad students like:

  • GIS analysis in python
  • Regression
  • Machine Learning & Optimization
  • Statistics for Crime Analysis
  • More advanced project management in python

Having figured out much of this grunt work definitely makes me more motivated, but ultimately I need a certain level of sales to justify the effort. So please, if you like the blog, pick up a copy and tell a friend you like my work!


4 days ago
Excellent how-to with ideas for publishing a book.
Ferguson, MO, USA

Obsessive Astros Fans Documented Their 2017 Cheating

2 Comments and 3 Shares

Tony Adams:

My name is Tony Adams. I’m an Astros fan. In November 2019, when the videos of the banging during some Astros 2017 games came out, I was horrified. It was clear within a minute of watching it was true — my team had cheated. To understand the scope of the cheating and the players involved, I decided to look at each home game from that season and determine any audio indicators of the sign stealing.

I wrote an application that downloaded the pitch data from MLB’s Statcast. This data has a timestamp for every pitch. I then downloaded the videos from YouTube and, using the timestamp, created a spectrogram for every pitch. A spectrogram is a visual representation of the spectrum of frequencies in an audio file. I could then playback the video of the pitches and, helped by the visual of the spectrogram, determine if there was any banging before the pitch.

I initially thought it would be quick work, and the application did make it pretty straightforward, but there are a lot of pitches in an MLB season. I ended up watching and logging over 8,200 pitches. And some more than once to be sure I was as accurate as possible.

I love everything about this. The obsession, the presentation of the data, and most of all, the fact that Adams is an Astros fan, and rather than make excuses for his team’s cheating, he’s upset by it.
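The spectrogram step Adams describes can be sketched in a few lines. This is not his application, which is not shown here; it is a minimal pure-Python illustration that runs on a synthetic clip. The sample rate, frame size, and hop are hypothetical, and a naive DFT stands in for the FFT a real audio tool would use.

```python
import cmath
import math


def spectrogram(samples, frame_size, hop):
    """Return one list of magnitudes (bins 0..frame_size//2) per overlapping frame."""
    # Hann window tapers each frame to reduce leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        chunk = [samples[start + n] * window[n] for n in range(frame_size)]
        mags = []
        for k in range(frame_size // 2 + 1):
            # Naive DFT for bin k; fine for a demo, too slow for real audio.
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Synthetic clip: silence with a short 100 Hz burst standing in for a "bang".
RATE = 1000  # samples per second (hypothetical)
HOP = 25
clip = [0.0] * 500
for i in range(200, 300):
    clip[i] = math.sin(2 * math.pi * 100 * i / RATE)

spec = spectrogram(clip, frame_size=50, hop=HOP)
energy = [sum(m * m for m in frame) for frame in spec]
loudest = max(range(len(energy)), key=energy.__getitem__)
print("loudest frame starts at", loudest * HOP / RATE, "seconds")
```

Each row of spec is one time slice and each column one frequency bin, so a bang shows up as a bright patch at the matching time. Here the loudest frame lands inside the burst between 0.2 and 0.3 seconds.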

One bit that came of this. David Spampinato:

On August 4th, the game with the most trash can bangs, the Astros scored 16 earned runs. Mike Bolsinger, a Blue Jays reliever, allowed 4 earned runs in 0.1 IP. He never pitched in the big leagues again.

What a disgrace. MLB should strip the Astros of their World Series title.

1635 days ago
Love this data presentation.
Ferguson, MO, USA


1 Comment and 2 Shares

If you’ve got a soft spot for vintage ’80s vector-graphic video games like Star Wars and Battlezone, you’re going to love this new short film by Stu Maschwitz. So great. Also, a fantastic 20-minute video on how it was made.

2252 days ago
Awesome short film.
Ferguson, MO, USA

New parallax ANSImation: Millennium Falcon dodging asteroids

1 Comment
I want to push boundaries. That’s what the original Star Wars films did. Industrial Light & Magic revolutionized special effects with novel new techniques for motion control and amazing model work. When I work on ANSI projects now, I try to think about ways to do things in ANSI that weren’t possible in the 1990s […]
2615 days ago
Hope folks might get a kick out of this. Star Wars 40th anniversary + ANSI artists = awesome
Ferguson, MO, USA

★ Safari vs. Chrome on the Mac


Eric Petitt, writing for The Official Unofficial Firefox Blog yesterday:

I head up Firefox marketing, but I use Chrome every day. Works fine. Easy to use. Like most of us who spend too much time in front of a laptop, I have two browsers open; Firefox for work, Chrome for play, customized settings for each. There are multiple things that bug me about the Chrome product, for sure, but I’m OK with Chrome. I just don’t like only being on Chrome. […]

But talking to friends, it sounds more and more like living on Chrome has started to feel like their only option. Edge is broken. Safari and Internet Explorer are just plain bad. And unfortunately, too many people think Firefox isn’t a modern alternative.

In an update posted today, he walked that back:

In my original post I made a personal dig about Edge, IE and Safari: “Edge is broken. Safari and Internet Explorer are just plain bad.” I’ve since deleted that sentence.

It’s true, I personally don’t like those products, they just don’t work for me. But that was probably a bit too flip. And, if it wasn’t obvious that those were my personal opinions as a user, not those of the good folks at Firefox and Mozilla, then please accept my apology.

It’s easy when making an aside — and it’s clear that the central premise of this piece is about positioning Chrome as the Goliath to Firefox’s David, so references to Safari and IE are clearly asides — to conflate “I don’t like X” with “X is bad”. So I say we let it slide.1

But I’ve been meaning to write about Safari vs. Chrome for a while, and Petitt’s jab, even retracted, makes for a good excuse.

I think Safari is a terrific browser. It remains the one and only browser for the Mac that behaves like a native Mac app through and through. It may not be the fastest browser but it is fast. And its energy performance puts Chrome to shame. If you use a Mac laptop, using Chrome instead of Safari can cost you an hour or more of battery life per day.2

But Chrome is a terrific browser, too. It’s clearly the second-most-Mac-like browser for MacOS. It almost inarguably has the widest and deepest extension ecosystem. It has good web developer tools, and Chrome adopts new web development technologies faster than Safari does.

But Safari’s extension model is more privacy-conscious. For many people on MacOS, the decision between Safari and Chrome probably comes down to which ecosystem you’re more invested in — iCloud or Google — for things like tab, bookmark, and history syncing. Me, personally, I’d feel lost without the ability to send tabs between my Macs and iPhone via Continuity.

In short, Safari closely reflects Apple’s institutional priorities (privacy, energy efficiency, the niceness of the native UI, support for MacOS and iCloud technologies) and Chrome closely reflects Google’s priorities (speed, convenience, a web-centric rather than native-app-centric concept of desktop computing, integration with Google web properties). Safari is Apple’s browser for Apple devices. Chrome is Google’s browser for all devices.

I personally prefer Safari, but I can totally see why others — especially those who work on desktop machines or MacBooks that are usually plugged into power — prefer Chrome. DF readers agree. Looking at my web stats, over the last 30 days, 69 percent of Mac users visiting DF used Safari, but a sizable 28 percent used Chrome. (Firefox came in at 3 percent, and everything else was under 1 percent.)3

As someone who’s been a Mac user long enough to remember when there were no good web browsers for the Mac, having both Safari and Chrome feels downright bountiful, and the competition is making both of them better.

  1. What really struck me about Petitt’s piece wasn’t the unfounded (to my eyes) dismissal of Safari, but rather his admission that he uses “Firefox for work, Chrome for play”. I really doubt the marketing managers for Chrome or Safari spend their days with a rival browser open for “play”, and even if they did, I expect they’d have the common sense not to admit so publicly, and especially not in the opening paragraph of a piece arguing that their own browser is a viable alternative to the rival one. ↩︎

  2. Back in December, when Consumer Reports rushed out their sensational report claiming bizarrely erratic battery life on the then-new MacBook Pros (which was eventually determined to be caused by a bug in Safari that Apple soon fixed), I decided to try to loosely replicate their test on the MacBook Pro review units I had from Apple. Consumer Reports doesn’t reveal the exact details of their testing, but they do describe it in general. They set the laptop brightness to a certain brightness value, then load a list of web pages repeatedly until the battery runs out. Presumably they automate this with a script of some sort, but they don’t say.

    That’s pretty easy to replicate in AppleScript. I used that day’s leading stories on TechMeme as my source for URLs to load — 26 URLs total. When a page loads, my script waits 5 seconds, and then scrolls down (simulating the Page Down key), waits another 5 seconds and pages down again, and then waits another 5 seconds before paging down one last time. This is a simple simulation of a person actually reading a web page. While running through the list of URLs, my script leaves each URL open in a tab. At the end of the list, it closes all tabs and then starts all over again. Each time through the loop the elapsed time and remaining battery life are logged to a file. (I also logged results as updates via messages sent to myself via iMessage, so I could monitor the progress of the hours-long test runs from my phone. No apps were running during the tests other than Safari, Script Editor, Finder, and Messages.)

    I set the display brightness at exactly 68.75 percent for each test (11/16 clicks on the brightness meter when using the function key buttons to adjust), a value I chose arbitrarily as a reasonable balance for someone running on battery power.

    Averaged (and rounded) across several runs, I got the following results:

    • 15-inch MacBook Pro With Touch Bar: 6h:50m
    • 13-inch MacBook Pro With Touch Bar: 5h:30m
    • 13-inch MacBook Pro (2014): 5h:10m
    • 11-inch MacBook Air (2011): 2h:15m

    I no longer had a new 13-inch MacBook Pro without the Touch Bar (a.k.a. the “MacBook Esc”) — I’d sent it back to Apple. I included my own personal 2014 13-inch MacBook Pro and my old 2011 MacBook Air just as points of reference. I think the Air did poorly just because it was so old and so well-used. It still has its original battery.

    I saw no erratic fluctuations in battery life across runs of the test. I procrastinated on publishing the results, though, and within a few weeks the whole thing was written off with a “never mind!” when Apple fixed the bug in Safari that was causing Consumer Reports’s erratic results.

    Anyway, the whole point of including these results in this footnote is that I also ran the exact same test with Chrome on the 13-inch MacBook Pro With Touch Bar. The average result: 3h:40m. That’s a 1h:50m difference. On the exact same machine running the exact same test with the exact same list of URLs, the battery lasted almost exactly 1.5 times as long using Safari as it did using Chrome.

    My test was in no way meant to simulate real-world usage. You’d have to be fueled up on some serious stimulants to read a new web page every 15 seconds non-stop for hours on end. But the results were striking. If you place a high priority on your MacBook’s battery life, you should use Safari instead of Chrome.

    If you’re interested, I’ve posted my battery testing scripts for Safari and Chrome. ↩︎

  3. If anyone has a good source for browser usage by MacOS users from a general purpose website like The New York Times or CNN, let me know. I honestly don’t know whether to expect that the split among DF readers is biased in favor of Safari because DF readers are more likely to care about the advantages of a native app, or biased in favor of Chrome because so many of you are web developers or even just nerdy enough to install a third-party browser in the first place. Wikimedia used to publish stats like that, but alas, ceased in 2015. ↩︎
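The arithmetic behind the Safari vs. Chrome comparison in footnote 2 can be checked quickly. The two battery times are the ones reported there for the 13-inch MacBook Pro With Touch Bar; the helper name is just for illustration.

```python
def minutes(hours, mins):
    """Convert an h:m battery reading to total minutes."""
    return hours * 60 + mins


safari = minutes(5, 30)  # Safari result from footnote 2
chrome = minutes(3, 40)  # Chrome result on the same machine

diff = safari - chrome
print(f"difference: {diff // 60}h:{diff % 60:02d}m")  # difference: 1h:50m
print(f"ratio: {safari / chrome:.2f}")                # ratio: 1.50
```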

2617 days ago
I want to use Safari on the Mac as my go-to browser, since it obviously syncs with my phone, etc. But I keep running into issues where Safari will freeze and the entire Mac becomes unresponsive, forcing me to power it off and restart. So frustrating. But Chrome never does that. It can only crash itself.
Ferguson, MO, USA
2617 days ago
Do you have any extensions installed? Since dropping Flash support years ago, that doesn't normally happen in fairly heavy use for me with no extensions other than 1Password.
2616 days ago
Weird. Safari shouldn't be able to do that, either — it does use some private APIs but doesn't have any special privileges. Any user-mode app that can crash macOS is fundamentally a macOS bug, though it sounds like this one might be hard to isolate.
2616 days ago
I've seen that kind of behaviour in two cases – video driver issues tend to be obvious but I've also seen it in cases where apps start rapidly leaking memory and the system becomes unresponsive until the swap file fills up and the app crashes
2616 days ago
Pretty good description of the problem I'm describing here: https://discussions.apple.com/thread/7687562. I'm trying one of the recommended solutions.
2 public comments
2616 days ago
Safari, baby. Used it to build NewsBlur and still use it everyday.
Cambridge, Massachusetts
2617 days ago
Firefox on Android allows you to use ublock origin, whereas Chrome on android doesn't... so Firefox on Android has become my default browser on there, but Chrome stays as default for me on macos.
New York, NY
2617 days ago
yeah as mediocre as FF-on-android is, the ability to install extensions is the reason I use it.

Pinterest Acquires Instapaper


Instapaper CEO Brian Donohue, on Hacker News:

Based on the comments I’ve read below, the main concerns seem to be that Instapaper will either be shut down or materially changed in a way that affects the end-user experience. I can tell you that neither of those is the plan for the short-term or long-term of the product, and I am personally looking forward to providing you with the same great service under a new owner.

We’ll see. Pinboard developer Maciej Ceglowski:

The “we sold to Pinterest but nothing is changing” email is Instapaper’s equivalent of reassuring grandma about her move to a nursing home.

2890 days ago
Earlier this year I wrote an app for a friend with low vision that used Instapaper's new "instaparser" API to extract content from news sites, and read that content aloud. Now the API is being shut down and who knows if Instaparser itself will continue.
Ferguson, MO, USA
1 public comment
2891 days ago
I'm kind of glad I went mostly all-in on Pocket now. I hope Instapaper stays as it is (and I'll continue to pay for it), but I can't really see it long-term.
Littlehampton, UK
2890 days ago
Do you use the paid version of Pocket? I've been saving things to Pocket for... years(?) and have yet to use it like I used del.icio.us before Yahoo ruined it. Maybe I'm not using Pocket correctly, but there's not much of a UI and it doesn't give me good results unless I've tagged everything (which is probably fair enough).