Sunday, November 12, 2017

On whiteboard coding interviews

I'm in a ranty mood this evening. Looking through my past, one thing that bothers me is the ritual called "whiteboarding".

I've taken and given a lot of these interviews. I personally find the process demeaning, dehumanizing, biased, and subjective. And if the company uses the terms "cultural fit" or "calibration" when teaching you how to whiteboard, be wary.

My first software development interview was in 1996. I walked in, showed my Game Developer Magazine articles and demos (in DOS of course), spoke with the developers and my potential manager, and they made me an offer. Looking back, I was so young, inexperienced and naive at 20 years old. It was a tough gig but we shipped a cool product (Montezuma's Return). There was no whiteboard, all that mattered was the work and results.

Anyhow, my interview at Blue Shift was similar. No whiteboard, just lots of meetings.

At Ensemble (Microsoft), I got a contract gig at first. This turned into a full-time gig. The interviews there were informal and very rarely (if ever) involved problem solving on a whiteboard.

Right before Ensemble, I also interviewed at Microsoft ATG. It was a stressful, heavy duty whiteboard interview with several devs. It was intense, and that night I fell asleep at the table of an unrelated dinner with friends. I got an offer, but Ensemble's was better. I later learned it was basically a form of "Trauma Bonding". Everyone else did it, so you had to go through it too to get "in". Overall, I remember the Microsoft engineers I interviewed with seemed to be all tired and somewhat stressed out, but they were very professional and respectful.

After Age3 shipped, I interviewed at Epic. I was tired from crunching on Age3, and was unprepared. It was the most horrific interview I've ever taken or seen. Incredibly unprofessional. The devs didn't want to be interviewing anyone. I flopped this interview (and probably dodged a bullet as the working conditions there at the time seemed really bad). Nobody at Ensemble knew I interviewed there, and I'm glad I didn't leave.

Years later, I interviewed at Valve. It was another exercise in Trauma Bonding. I was so stressed it was ridiculous, and I found Dune's "The Litany Against Fear" helpful. Somehow I got through, and looking back I think Gabe Newell (who visited Ensemble and met me there) might have helped get me in without my knowledge. I was lucky to get in at all, because I interviewed as a generalist. If I had interviewed as a graphics specialist I never could have gotten in. (Because at the time the gfx coders at Valve had a pact of sorts, and unless you were Carmack it was virtually impossible to survive the whiteboard. So many graphics specialists got turned down that after a while the high-ups at Valve took notice and changed things.)

Anyhow, one of my points is, I've been pretty lucky to get to work at these places. I learned a lot. Most of the companies I worked at didn't use whiteboarding. Interestingly, the cultures of the non-whiteboarding companies were much healthier.

I sometimes wonder: if I wasn't a white male, or overweight, with all other things unchanged, would I have got these gigs? I very highly doubt it.

I've implemented and shipped tons of algorithms, products, etc. But I hate whiteboarding.

I think the tech companies use this process to slow down horizontal movement between companies. It keeps labor in place, and developer wages/prices down. The "price" of moving between companies (in terms of stress, and potential "whiteboard defeat") is purposely held high. Independent of whether or not this is done purposely, this is the end result.

If you've got to whiteboard, it can't hurt to practice like crazy. And read a few whiteboard coding interview books. Also, tap your social network and find devs who interviewed at your target company, and ask them what happened. If companies are going to do this, at least make them put some effort into it.

One trick I've seen done: After a big layoff, a group of devs gets together and starts interviewing at various companies. After every dev interviews at a particular shop, everything about the interview, and the whiteboard questions, are discreetly shared with the group. The first dev to be sent to a particular company won't be expected to get in (and very well might not want to work there in the first place). Once developers start acting as a group the entire process gets "gamed" particularly effectively.

Wednesday, September 27, 2017

Things learned while running your own self-funded startup

Here's a brain dump of the things we've learned while running our business and shipping our first product (Basis).

My experience at Valve somewhat helped prepare me for doing this. Working at Valve was like a microcosm of working at your own company. You needed to find customers, interact with them, and figure out what was valuable to them. (You also needed to identify "competitors" and do your best to ignore or respond to whatever challenges they might throw your way.) Financial concerns weren't an issue, but time and your reputation at the company was. I noticed a feedback loop there: The more success you got at Valve, the easier it was to find projects to help out on. As you earned "Valve Bucks" doors got opened much easier.

Entering Valve with basically zero Valve Bucks was a big challenge. It wasn't enough to merely be a good engineer at Valve. If you were a good engineer with zero communication skills your chances at surviving and thriving when I was there were pretty low. If you acted like an asshole and didn't have many friends it didn't matter how good you were or how awesome your accomplishments were. People like this would be fired sooner or later.

Anyhow, running your own company has a number of additional challenges. There are no bi-weekly paychecks, no free lunches, no PTO, no yearly Hawaiian vacation, and no on-site lawyers. You are now in the real world, and you're leaving the high school like corporate drama behind. Everything, including staying financially solvent, is now your responsibility.

Some things we've learned:

1. This is a dramatically more mature way of working vs. full-timing. Your boss is basically the bank. Keeping your account in the green is like an optimization problem. If you fail you go under, or you wind up in the arms of potentially predatory investors.

2. You want a product ASAP. Contract work is basically linear income relative to time, while products can be exponential. Just choose a product and ship it. If it fails, try again and again, because the things you learned while working on the first product will help you immensely on your second.

3. Products can take a long time to develop and monetize. Contract work can bring in immediate income, but only a trickle. The big challenge is working on contracts to stay afloat in the short term, but also finding time to work on your product for long term success.

4. There are lots of ways to stay funded until your product takes off: You can use savings, loans from friends, investor funds, government grants, and income from short contracts. I would recommend staying away from investors as much as you can, because once you get in bed with investors you no longer totally own your company (and it can be basically taken away from you).

5. Every decision must be made extremely carefully. Bad decisions cost money.

6. The large companies move very slowly. Do not place any bets on getting paid quickly by large companies, no matter how happy they say they are with you.

So, at least with a software middleware product, I would first target small customers because they move more quickly.

7. If you have a product that a very large company really wants, they'll still do everything they can to delay purchasing it for the market price. They'll try to hire you or your partner(s) away individually, or they'll wait as long as possible to see if you encounter hard financial times and go under. They won't come and just offer to license your product or buy you out until they've exhausted all other possibilities.

8. If your product offers evaluation licenses, then be very careful with the eval time period. Some companies will purposely demand very long evals as a form of negotiation leverage.

9. A company can feign interest in licensing your product, get your lawyer bogged down negotiating terms of the license (or eval license), then pull away or suddenly change their mind at the last minute. This costs money. To avoid this, a "put up or shut up" mentality can help. Either the company accepts your eval license with little fuss, or just move on.

10. No Hard Sells: If the company you are negotiating with gets overly emotional about the terms in your eval license, then move on.  Either they want the value your product offers, or they don't.

11. Research pricing: Your competitor(s) will publicly advertise low-ball prices to help lure in customers, but once you start negotiating with them the price goes up (sometimes massively). Talk to your competitor's customers and just ask them what they actually paid, and you'll be amazed at how much software middleware is actually worth in the market.

The publicly advertised price is basically just for corporate programmers, who generally don't understand the true market value of their code when properly packaged as a product. The public price is optimized so programmers won't feel bad about how underpaid they are, but it won't be too low so the coders will still perceive the product as having sufficient value.

Research the concept of "Death Prices". If the price is too low, it won't be perceived as having enough value to bother with, and low prices won't sustain your efforts. Set the price sufficiently high and let the market set the actual price. Most likely, if you're a programmer, you'll set the price too low because you've been brainwashed into thinking that your software doesn't have much value.

Large companies will pay high prices to be first to use your software, if it's perceived to be groundbreaking or awesome enough.

12. Find a good lawyer. Get your eval and software license figured out early. This is going to cost money, so save up. A lawyer with patent and software license experience is a huge bonus.

13. Interactions with real customers is priceless. Ask them what they want. For us, we were amazed at all the different ways the open source predecessor of Basis (crunch) was utilized. We pivoted our strategy to RDO encoders based off customer feedback. Our long term roadmap is based off what customers are actually doing with our software right now.

14. Open source is forever: Be extremely careful releasing open source software. "Thou shall not release too much functionality or features as open source". Open sourcing your work is both a blessing and a curse, and can be actually dangerous from a patent troll perspective.

Open source is great, because potential customers will get a chance to try out your work without spending a dime or ever talking to you. This boosts your credibility. On the flip side, if you give away too much, you are basically competing against yourself when you attempt to monetize your work as a product.

Your open source release should be a demo of the product, and no more. Give things away with the goal of eventually converting the users of the free software into paying customers.

Even if you don't intend on turning the software into a product, always keep in mind ways it can be eventually monetized. Your time is worth something.

Talk to every user of your open software software that you can find. Gather intelligence about how they actually use your software.

15. Your company must appear and be stable at all costs. If one year your large and expensive GDC booth isn't present, people will notice and you'll lose business. Even the biggest game middleware vendors have had serious cashflow problems. One almost went under a few years ago until they were bailed out by a big player. Even the biggest players sometimes take contract work to stay in the green, because product income isn't reliable.

16. Align yourself well: Being associated with CoMotion Labs and Khronos was invaluable to us. At CoMotion we were exposed to tons of other startups, and this cultural immersion was valuable.

17. Perception and psychology is extremely important. If you're a programmer, you're probably going to suck at the skills needed to bring your software to market. Find a partner who compliments you well.

18. You need friends, inside and outside of companies. Make as many friends as you can.

19. Some big corps can be very nasty:

"OMG, you can't do this due to patents!"
"You'll never take off because your price is too high"
"You'll run out of money and just come work for us, so we'll just wait you out"
"You must work for us because we're going to write this software ourselves and that'll impact your market share"

Corporate programmers at these megacorps can be horrifically nasty. Also, study and become acutely aware of triangulation when dealing with large hierarchical companies.

20. Some large teams have egos and won't want to license your software because of it. The challenge to licensing software in this situation will be overcoming this institutional ego, or just waiting to see how things pan out.

21. Be aware that companies talk to each other. A larger corp can use a smaller corp to help establish prices.

22. If you're at a company with deep pockets and you want to really have influence, offer the potential of a very large license fee for key software middleware. It works.

23. Do not blindly sign NDA's. Get a checklist from your lawyer and read them very carefully. If you try to negotiate over a clause in the NDA that totally sucks and the company refuses to budge, then move on.

Also, the NDA negotiation process can be very revealing. If the company is hard to deal with at this stage, then it's safer to just move on.

24. Treat your employees well, and with respect. We don't have any employees, but we've learned a lot while talking to employees at other middleware companies.

25. Your company will be defined just as much (if not more) by the customers you turn down vs. the ones you take.

If you get a bad feeling from a potential customer, or they aren't respectful, or they treat you substantially differently vs. how they treat your partner, then it's probably best to move on. Selling software like this is actually the establishment of a relationship, and every new relationship you take on has both risks and rewards. We're careful about who we work with.

Thursday, August 17, 2017

Why crunch likes uncompressed texture data

We've recently gotten some interest in creating a RDO compressor specifically for already compressed textures, which is why I'm writing this.

crunch works best with (and is designed for) uncompressed RGBA texture data. You can feed crunch already compressed data (by compressing to DXT, unpacking the blocks, and throwing the unpacked pixels into the compressor), but it won't perform as well. Why you ask?

crunch uses top down clusterization on the block endpoints. It tries to create groups of blocks that share similar endpoints. Once it finds a group of blocks that seem similar enough, it then uses its DXT endpoint optimizers on these block clusters to create the near-optimal set of endpoints for that cluster. These clusters can be very big, which is why crunch/Basis can't use off the self DXT/ETC compressors which assume 4x4 blocks.

DXT/ETC are lossy formats, so there is no single "correct" encoding for each input (ignoring trivial inputs like solid-color blocks). There are many possible valid encodings that will look very similar. Because of this, creating a good DXT/ETC block encoder that also performs fast is harder than it looks, and adding additional constraints or requirements on top of this (such as rate distortion optimization on both the endpoints and the selectors) just adds to the fun.

Anyhow, imagine the data has already been compressed, and the encoder creates a cluster containing just a single block. Because the data has already been compressed, the encoder now has the job of determining exactly which endpoints were used originally to pack that block. crunch tries to do this for DXT1 blocks, but it doesn't always succeed. There are many DXT compressors out there, each using different algorithms. (crunch could be modified to also accept the precompressed DXT data itself, which would allow it to shortcut this problem.)

What if the original compressor decided to use less than 4 colors spaced along the colorspace line? Also, the exact method used to interpolate the endpoints colors is only loosely defined for DXT1. It's a totally solvable problem, but it's not something I had the time to work on while writing crunch.

Things get worse if the endpoint clusterization step assigns 2+ blocks with different endpoints to the same cluster. The compressor now has to find a single set of endpoints to represent both blocks. Because the input pixels have already been compressed, we're now forcing the input pixels to lie along a quantized colorspace line (using 555/565 endpoints!) two times in a row. Quality takes a nosedive.

Basis improves this situation, although I still favor working with uncompressed texture data because that's what the majority of our customers work with.

Another option is to use bottom-up clusterization (which crunch doesn't use). You first compress the input data to DXT/ETC/etc., then merge similar blocks together so they share the same endpoints and/or selectors. This approach seems to be a natural fit to already compressed data. Quantizing just the selector data is the easiest thing to do first.

Sunday, June 25, 2017


I'm by no means an expert on anything San Diego, having been there only around 1.5 months since leaving Seattle. I did spend 8 years in Seattle though, and here's what I think:

- Seattle is just way too dark of a city for me to live there year round. Here's Seattle vs. San Diego's sunshine (according to

The winter rain didn't bother me much at all. It was the lack of sun. (Hint to Seattle-area corporate recruiters: Fly in candidates from sunnier climates like Dallas to interview during July-August.)

- There's a constant background noise and auditory clutter to Seattle and the surrounding areas that's just getting louder and louder as buildings pop up and people (and their cars) move in.

Eventually this background noise got really annoying. Even downtown San Diego is surprisingly peaceful and quiet by comparison.

- Seattle's density is both a blessing and a curse. It's a very walkable city, so going without a car is possible if you live and work in the right places.

The eastside and westside buses can be incredibly, ridiculously over packed. Seattle needs to seriously get its public transportation act together.

- As a pedestrian, I've found Seattle's drivers to be so much nicer and peaceful on the road vs. San Diego's. CA drivers seem a lot more aggressive.

- San Diego is loaded with amazing beaches. Seattle - not so much.

A few misc. thoughts on Seattle and the eastside tech workers I encountered:

I lived and worked on the eastside (near downtown Bellevue) and westside (U District) for enough time to compare and contrast the two areas. The people in Seattle itself are generally quite friendly and easy going. Things seem to change quickly once you get to the eastside, which feels almost like a different state entirely.

I found eastside people to be much less friendly and living in their own little worlds. I wish I had spent more of my time living in Seattle itself instead of Bellevue. Culturally Bellevue feels cold and very corporate.

The wealthier areas on the eastside seemed the worse. Wealth and rudeness seem highly correlated. So far, I've yet to meet a Bellevue/Redmond tech 10-100 millionaire (or billionaire) that I found to be truly pleasant to be around or work with. I also learned over and over that there is only a weak correlation between someone's wealth and their ability to actually code. In many cases someone's tech wealth seemed to be related to luck of the draw, timing, personality, and even popularity. Some of the wealthiest programmers I met here were surprisingly weak software engineers.

I've seen this happen repeatedly over the years: Average software engineers get showered with mad cash and suddenly they turn inward, become raging narcissistic assholes, and firmly believe they and their code is godly. Money seems to bring out the worse personality traits in people.

Monday, June 19, 2017

Basis's RDO DXTc compression API

This is a work in progress, but here's the API to the new rate distortion optimizing DXTc codec I've been working on for Basis. There's only one function (excluding basis_get_version()): basis_rdo_dxt_encode(). You call it with some encoding parameters and an array of input images (or "slices"), and it gives you back a blob of DXTc blocks which you then feed to any LZ codec like zlib, zstd, LZHAM, Oodle, etc.

The output DXTc blocks are organized in simple raster order, with slice 0's blocks first, then slice 1's, etc. The slices could be mipmap levels, or cubemap faces, etc. For highest compression, it's very important to feed the output blocks to the LZ codec in the order that this function gives them back to you.

On my near-term TODO list is to allow the user to specify custom per-channel weightings, and to add more color distance functions. Right now it supports either uniform weights, or a custom model for sRGB colorspace photos/textures. Also, I may expose optional per-slice weightings (for mipmaps).

I'm shipping the first version (as a Windows DLL) tomorrow.

// File: basis_rdo_dxt_public.h
#pragma once

#include <stdlib.h>
#include <memory.h>

#define BASIS_DLL_EXPORT __declspec(dllexport)  

#if defined(_MSC_VER)
#define BASIS_CDECL __cdecl

namespace basis
    // The codec's current version number.
    const int BASIS_CODEC_VERSION = 0x0106; 
    // The codec can accept rdo_dxt_params's from previous versions for backwards compatibility purposes. This is the oldest version it accepts.

    typedef unsigned int basis_uint;
    typedef basis_uint rdo_dxt_bool;
    typedef float basis_float;

    enum rdo_dxt_format
        cRDO_DXT1 = 0,


    enum rdo_dxt_encoding_speed_t

    const basis_uint RDO_DXT_STRUCT_VERSION = 0xABCD0001;

    const basis_uint RDO_DXT_QUALITY_MIN = 1;
    const basis_uint RDO_DXT_QUALITY_MAX = 255;
    const basis_uint RDO_DXT_MAX_CLUSTERS = 32768;
    struct rdo_dxt_params
        basis_uint m_struct_size;
        basis_uint m_struct_version;

        rdo_dxt_format m_format;

        basis_uint m_quality;

        basis_uint m_alpha_component_indices[2];

        basis_uint m_lz_max_match_dist;

        // Output block size to use in RDO optimization stage, note this does NOT impact the blocks written to pOutput_blocks by basis_rdo_dxt_encode()
        basis_uint m_output_block_size;

        basis_uint m_num_color_endpoint_clusters;
        basis_uint m_num_color_selector_clusters;

        basis_uint m_num_alpha_endpoint_clusters;
        basis_uint m_num_alpha_selector_clusters;

        basis_float m_l;
        basis_float m_selector_rdo_quality_threshold;
        basis_float m_selector_rdo_quality_threshold_low;

        basis_float m_block_max_y_std_dev_rdo_quality_scaler;

        basis_uint m_endpoint_refinement_steps;
        basis_uint m_selector_refinement_steps;
        basis_uint m_final_block_refinement_steps;

        basis_float m_adaptive_tile_color_psnr_derating;
        basis_float m_adaptive_tile_alpha_psnr_derating;

        basis_uint m_selector_rdo_max_search_distance;

        basis_uint m_endpoint_search_height;
        basis_uint m_endpoint_search_width_first_line;
        basis_uint m_endpoint_search_width_other_lines;

        rdo_dxt_bool m_optimize_final_endpoint_clusters;
        rdo_dxt_bool m_optimize_final_selector_clusters;

        rdo_dxt_bool m_srgb_metrics;
        rdo_dxt_bool m_debugging;
        rdo_dxt_bool m_debug_output;
        rdo_dxt_bool m_hierarchical_mode;
        rdo_dxt_bool m_multithreaded;
        rdo_dxt_bool m_use_sse41_if_available;

    inline void rdo_dxt_params_set_encoding_speed(rdo_dxt_params *p, rdo_dxt_encoding_speed_t encoding_speed)
        if (encoding_speed == cEncodingSpeedFaster)
            p->m_endpoint_refinement_steps = 1;
            p->m_selector_refinement_steps = 1;
            p->m_final_block_refinement_steps = 1;

            p->m_selector_rdo_max_search_distance = 3072;
        else if (encoding_speed == cEncodingSpeedFastest)
            p->m_endpoint_refinement_steps = 1;
            p->m_selector_refinement_steps = 1;
            p->m_final_block_refinement_steps = 0;

            p->m_selector_rdo_max_search_distance = 2048;
            p->m_endpoint_refinement_steps = 2;
            p->m_selector_refinement_steps = 2;
            p->m_final_block_refinement_steps = 1;

            p->m_endpoint_search_width_first_line = 2;
            p->m_endpoint_search_height = 3;

            p->m_selector_rdo_max_search_distance = 4096;

    inline void rdo_dxt_params_set_to_defaults(rdo_dxt_params *p, rdo_dxt_encoding_speed_t default_speed = cEncodingSpeedFaster)
        memset(p, 0, sizeof(rdo_dxt_params));

        p->m_struct_size = sizeof(rdo_dxt_params);
        p->m_struct_version = RDO_DXT_STRUCT_VERSION;

        p->m_format = cRDO_DXT1;

        p->m_quality = 128;

        p->m_alpha_component_indices[0] = 0;
        p->m_alpha_component_indices[1] = 1;

        p->m_l = .001f;

        p->m_selector_rdo_quality_threshold = 1.75f;
        p->m_selector_rdo_quality_threshold_low = 1.3f;

        p->m_block_max_y_std_dev_rdo_quality_scaler = 8.0f;

        p->m_lz_max_match_dist = 32768;
        p->m_output_block_size = 8;

        p->m_endpoint_refinement_steps = 1;
        p->m_selector_refinement_steps = 1;
        p->m_final_block_refinement_steps = 1;

        p->m_adaptive_tile_color_psnr_derating = 1.5f;
        p->m_adaptive_tile_alpha_psnr_derating = 1.5f;
        p->m_selector_rdo_max_search_distance = 0;

        p->m_optimize_final_endpoint_clusters = true;
        p->m_optimize_final_selector_clusters = true;

        p->m_selector_rdo_max_search_distance = 3072;

        p->m_endpoint_search_height = 1;
        p->m_endpoint_search_width_first_line = 1;
        p->m_endpoint_search_width_other_lines = 1;

        p->m_hierarchical_mode = true;

        p->m_multithreaded = true;
        p->m_use_sse41_if_available = true;

        rdo_dxt_params_set_encoding_speed(p, default_speed);
    const basis_uint RDO_DXT_MAX_IMAGE_DIMENSION = 16384;

    struct rdo_dxt_slice_desc
        // Pixel dimensions of this slice. A slice may be a mipmap level, a cubemap face, a video frame, or whatever.
        basis_uint m_image_width;
        basis_uint m_image_height;
        basis_uint m_image_pitch_in_pixels;

        // Pointer to 32-bit raster image. Format in memory: RGBA (R is first byte, A is last)
        const void *m_pImage_pixels;

} // namespace basis

extern "C" BASIS_DLL_EXPORT basis::basis_uint BASIS_CDECL basis_get_version();
extern "C" BASIS_DLL_EXPORT basis::basis_uint BASIS_CDECL basis_get_minimum_compatible_version();

extern "C" BASIS_DLL_EXPORT bool BASIS_CDECL basis_rdo_dxt_encode(
    const basis::rdo_dxt_params *pEncoder_params,
    basis::basis_uint total_input_image_slices, const basis::rdo_dxt_slice_desc *pInput_image_slices,
    void *pOutput_blocks, basis::basis_uint output_blocks_size_in_bytes);

Sunday, April 30, 2017

Binomial stuff

One MS employee recently said to Stephanie (my partner) that (paraphrasing) "your company isn't stable and can't possibly last". My reply: We've been in business for over a year now, and our business is just a natural extension and continuation of our careers. I've been programming since 1985, and developing commercial data compression and other software since 1993. I've been doing this for a while and I'm not going to stop anytime soon.

Having my own small consulting company vs. just working full-time for a single corporation is just a natural next step to me. One thing I really liked about working at Valve was the ability to wheel my desk to virtually anywhere in the company and start adding value. I can now "wheel my desk" to anywhere in the world, and the freedom this gives us is amazing.

Binomial is a self-funded startup. We work on both development contracts and our current product (Basis). We haven't taken any investment money. Our "runway" is basically infinite.

Wednesday, April 19, 2017

Basis status

Just a small update. We've put like 99% of our effort into ETC1 and ETC1+DXT1 over the previous 5-6 months. Our ETC1 encoder supports RDO and an intermediate format, and has shipped on OSX/Linux/Windows. I've been modifying the ETC1 encoder to also support DXT1 (for our universal format) over the previous few weeks.

Our ETC1 encoder was written almost from scratch. The next major step is to roll back all the improvements and things I've learned while implementing our ETC1 encoder back into our DXT-specific encoder. crunch's support for DXT has a bunch of deficiencies which hurt ratio. (Roy Eltham and Fabian Giesen have recently pointed this issue out to me. I've actually been aware of inefficiencies in crunch's codebook generator for a few months, since working on the new codebook generator for ETC1.) I'm definitely fixing this problem (and others!) in Basis.

Saturday, March 18, 2017

Probiotic yogurt making

Just got back from a wonderful business trip to Portland Maine, visiting ForeFlight. Making more probiotic yogurt tonight because I ate up almost my entire stock on the trip. (It didn't help that we got stuck in a blizzard while there, but that turned out to be really fun.) The food in Portland is amazing!

The pot of boiling water is for sterilizing the growth medium, in this case 2% organic grassfed milk+raw sugar. After the milk is boiled (repastuerized) and cooled I inoculate it using a 10 strain probiotic blend from Safeway. I tried a bunch of probiotics before finding this particular brand, which seems magical for me. Without this extremely strong yogurt I have no idea how long it would have taken my gut to heal after the antibiotics I had to take in 2015.

Yogurt making like this is tricky. Early on, around 30% of my attempts failed in spectacular ways. These days my success rate is almost 100%. Sterilization of basically everything (including tools, spoons, etc.) over and over throughout the process is critical to success.

Nerd/Brogrammer spectrum

Okay, I've been watching Silicon Valley. This show is so realistic and exaggerated that I find it painful to watch at times (especially the first couple episodes), but it's also funny as hell. It really helps to put things into perspective about why I'm now a consultant and not "exclusive" to a single corporation anymore.

Some observations:

- Notice that the interpersonal relationships between basically all the characters are pretty toxic. Everyone seems to be exploiting everyone else in some way, and money/wealth is a large motivator to most characters.

- Look at the above frame. Where's the programmer in the middle in this series? The person who works out, is empathetic, loves to program, but isn't a total and complete asshole.

Friday, March 10, 2017

Virtual selector codebook example image

I'm going to post some examples over the next few days. Note these examples are not .basis compressed images, instead they are just ETC1 compressed images where each block's selectors have been replaced by the "best" corresponding virtual selector codebook entry. Each block was processed independently, using a brute force search on a 20 core workstation.

I used an 8-bit seed dictionary and 11-bits to control the amplification functions, so the virtual CB size was 512K entries. The seed dictionary and functions were tuned on the set of photos, test images, and textures used to tune crunch. My current implementation seems to work best on photos and textures, and worst on synthetic images and high-contrast text.

In .basis, this method is currently only used to compressed selector codebooks, not entire images, so a brute force search over the entire virtual codebook isn't too insane. Also, basis supports a hybrid codebook mode, so it can select between virtual VB entries and "raw" entries depending on quality.

(Posting links so blogger doesn't resample the images.)

ETC1 compressed with selectors chosen from a 512K virtual codebook - .977149 Luma SSIM

ETC1 compressed - .996428 Luma SSIM


Other ETC1 examples using vitual selector CB's (sorry no SSIM's, although I do have this data):

Blues Brothers

Sunday, March 5, 2017

Virtual global selector codebooks in Basis

Much of this information will be present in .basis's open source transcoder, so here's a little brain dump of how it works.

Selectors in DXT1 and ETC1 are 2-bits each. These values select which block color is used to represent each texel in a block. The block size in these formats is 4x4 texels, so each selector can be treated as a 16D vector. The uncompressed size of a selector vector is 32-bits (4*4*2).

One way of compressing these selector vectors is vector quantization (VQ). One problem with straightforward VQ is the expense of codebook storage. crunch stores selector codebooks in a compressed form, using order-1 delta compression on each vector's component, and it tries to rearrange the codebook entry order to minimize these deltas. With large codebooks the amount of bits needed to store the compressed selector vectors becomes prohibitive relative to the compressed texture data.

An alternative technique is to switch to a multi-level codebook scheme, where each .basis file has a "local" selector codebook which refers into a much larger constant "global" selector codebook. (This is somewhat inspired by Graham Devine's ROQ video file format used in Quake 3, which used multilevel codebooks.) Now we've got a serious memory problem, because the global (constant) selector codebook is going to require several megabytes of memory. A 1024^2 global selector codebook requires a 4MB table, which is undesirable on many platforms (i.e. WebGL).

.basis works around this problem by using a very small (256 entry) global selector vector codebook which is procedurally "amplified" using a series of small functions. These functions can rotate the vector by 90, 180, or 270 degrees, vertically flip the vector, change its contrast, add Gaussian noise to the vector, invert the vector, etc. It turns out that this method is surprisingly powerful and simple, and lends itself well to a hardware implementation.

To select a "virtual" selector codebook entry in this scheme, the encoder first selects a "seed" vector from the small global codebook (which requires 8-bits), then it specifies a series of control bits (typically 6-12) to select which procedural routines are used to modify the codebook entry. The control bits can be optionally arithmetically coded, or stored as-is. It's also possible to completely discard with the seed codebook and just use a PRNG (with some post processing) to generate the initial selector entries. In this case, the seed bits are used to prime the PRNG.

Function usage statistics in kodim18 (on each ETC1 block):

shift_x: Samples: 24576, Total: 12737.000000, Avg: 0.518270, Std Dev: 0.499666
shift_y: Samples: 24576, Total: 10472.000000, Avg: 0.426107, Std Dev: 0.494510
flip: Samples: 24576, Total: 10811.000000, Avg: 0.439901, Std Dev: 0.496375
rot: Samples: 24576, Total: 18679.000000, Avg: 0.760050, Std Dev: 0.427052
erode: Samples: 24576, Total: 6115.000000, Avg: 0.248820, Std Dev: 0.432329
dilate: Samples: 24576, Total: 7067.000000, Avg: 0.287557, Std Dev: 0.452623
high_pass: Samples: 24576, Total: 10076.000000, Avg: 0.409993, Std Dev: 0.491832
rand: Samples: 24576, Total: 12522.000000, Avg: 0.509521, Std Dev: 0.499909
div: Samples: 24576, Total: 10053.000000, Avg: 0.409058, Std Dev: 0.491660
shift: Samples: 24576, Total: 8016.000000, Avg: 0.326172, Std Dev: 0.468811
contrast: Samples: 24576, Total: 6080.000000, Avg: 0.247396, Std Dev: 0.431499
inv: Samples: 24576, Total: 12075.000000, Avg: 0.491333, Std Dev: 0.499925
median: Samples: 24576, Total: 7947.000000, Avg: 0.323364, Std Dev: 0.467760

rot's usage was ~76% because it was used 3/4 times. ~25% of the time the vector wasn't rotated. I've been continually surprised how easy it has been to find useful functions.

Here's a description of the current set of procedural functions in .basis:

  • shift_x/y: Shifts the block's selectors up or left by 1 row/column
  • flip: Vertical flip
  • rotate: Rotates by 0, 90, 180, or 270 degrees
  • erode: 3x3 erosion morphological  operator
  • dilate: 3x3 dilation morphological operator
  • high pass: 3x3 high-pass filter
  • rand: Adds Gaussian noise to the selectors, using the selectors themselves as a seed for the PRNG.
  • div: Selector remapping through table { 2, 0, 3, 1 }
  • shift: Adds 1 to the selectors with clamping
  • contrast: Boosts contrast of selectors by remapping them through 1 of 3 tables: { 0, 0, 3, 3 }, { 1, 1, 2, 2 }, { 1, 1, 3, 3 }
  • inv: Inverts the selectors
  • median: 3x3 selector median filter

The order that these functions are applied matters, and I'm still figuring out the optimal order. The control bits select which combination of the above functions is used to modify the selectors, and for a couple functions (like rot and contrast) multiple control bits are needed.

With a method like this it's possible to compress a selector vector down to 14-16 bits. Quality is extremely good. The biggest problem I'm working on solving now is how to efficiently search such large virtual codebooks during encoding without sacrificing too much quality or introducing artifacts. Full codebook searches are very slow.

Sunday, February 26, 2017

Interesting looking ETC1/2 encoder

This encoder uses SSIM as an image quality metric at the block level:

An article with details (in Russian):

I'll be adding it to the next run of my ETC benchmark.

Saturday, February 18, 2017


I love herbal adaptogens, especially Rhodiola, and Ashwagandha.  Adaptogens are a rare, special breed of herb. Pretty much every game and software developer should at least understand what adaptogens do. How I wish I knew about adaptogens over the past 20+ years!

Effects of Adaptogens on the Central Nervous System and the Molecular Mechanisms Associated with Their Stress—Protective Activity

With adaptogens, my stress response feels totally different. Check out this graph:

An external file that holds a picture, illustration, etc.
Object name is pharmaceuticals-03-00188-g001.jpg

Adaptogens are like "RadAway" for internal and/or external stress:

So far, the most powerful and reliable blend of adaptogens I've tried is "Adrenal Health - Daily Support" from Gaia. Their plain Rhodiola, Holy Basis, and Ashwagandha extracts are also very good. 

The downside to adaptogens: If you take too much, you're just not going to feel tired. If you *like* feeling tired, be cautious of dose. The slight CNS stimulating effect can be too energizing for some people, so go especially slow at first and only take a tiny amount only in the morning. If you just want to get your feet wet and play around with small amounts of adaptogens, try some Rhodiola extract in tea. So far, Ashwagandha feels the least stimulating to me.

Check out the anti-toxic effects of certain adaptogens. This is the kind of stuff James Bond surely takes every day:
"The adaptogens Eleutherococcus senticosus, Rhodiola rosea and Schisandra chinensis were reported to be safe in acute and subacute toxicity studies. Moreover, adaptogen induced state of non-specific resistance to highly toxic chemicals (e.g., chlorophos, phosphorus, cyclophosphane, strychnine, aniline, sodium nitrite, narcotics like sodium barbital, hexenal, chloralhydrate, benzene, acetone, ether, etc.) and microbes demonstrated in many pharmacological/toxicological studies [28], actually implies that they have an anti-toxic activity. For example, it was demonstrated that repeated administration of Rhodiola rosea extract during 10 consecutive days decreased LD50 of 40% ethanol in mice from 24.1 ml/kg to 55.2 ml/kg. It was also shown that salidroside shortened (from 100% to 19%) the duration of benzene induced sleep in rats"
What I wish I could find is a concise breakdown of the exact properties of each adaptogenic herb. For example, how stimulating is each one? How powerful is the cortisol lowing/normalizing effect of each, etc.

BTW - if you for some reason are freaked out by the concept of herbal medicine, the stimulating effect of caffeine feels much more powerful than any adaptogen I've tried. Also, adaptogens are not addictive like caffeine is. I have more consistent and reliable energy on adaptogens, and overall less energy when drinking caffeine.

Tuesday, January 24, 2017

"Hybrid" supercompressed texture approach

I think this is (vaguely to roughly - no idea?) similar to the new supercompressed texture algorithm John Brooks is working on, CEO of Blue Shift:

Instead of crunch-style top down clusterization (VQ) on the color endpoints and selector vectors with GPU format specific optimization and refinement stages, you could convert a pixel-wise codec (something like JPEG but hopefully better!) to directly output GPU block data. In ETC1, each subblock's colorspace line lies along the grayscale axis. The selector optimization/decision process is driven very strongly by the importance of luma vs. chroma error in ETC1. (And it's possible to trivially convert ETC1 data into high quality DXT1, so none of this info is specific to exclusively ETC1.)

The compressor can send "mode" bits to the decompression/transcoder system, depending on the importance of each bit of compressed data. The actual importance is related to a balance between bitrate and desired decompression/encoding time. If something is just way too expensive to do in the transcoder, then just do the computation in the encoder offline and send the results as some arithmetically coded bits with the right contexts.

And the encoder can make all key RDO decisions keeping in mind the destination GPU format (ETC1, DXT1, ASTC, etc.) and the transcoder's strengths/weaknesses.

It's also possible to combine crunch's clusterization+refinement algorithm to handle the chroma problem in this format. Use VQ on the luma/chroma endpoint information, and a pixel-wise approach on the luma to drive the selector decision process. I can imagine so many possibilities here!

Basis ETC1 support progress

The front-end is written, now all of my effort has been on designing an efficient intermediate file format for ETC1 and a decent enough (but an improvable) encoder to go along with it. The idea is to prioritize the development of a powerful enough intermediate file format itself first, along with a "hopefully good enough" encoder. So I've been quickly trying a bunch of different RDO-driven (mostly back end) optimizations, to find some interesting ones to exploit later down the line. Because once the transcoder ships I can't change it for a long time, if ever.

The next step is to add one more major feature, port it to Linux, and ship it to our first customer.

Thursday, January 19, 2017

Basis ETC1 intermediate format support is now round-tripping

We can now encode to our new compressed intermediate format for ETC1, and quickly transcode this compressed data to raw ETC1 bits. (Just like crunch's "CRN" format, but for ETC1.) This required many hundreds of compression and coding experiments to get here, and a bunch of work. The next major step is to port it to Linux and continue optimizing/tuning the encoder. We are freezing out file format for ETC1 within the next 2-3 weeks, and beyond that we'll just be tuning the encoder for higher quality output per bit.

For the first time, the backend of any encoder I've written supports explicit selector RDO. It basically tries to balance luma noise vs. bitrate. I've discovered a large number of improvements to crunch's basic approach. Each time I get tempted to try a completely different algorithm, I find another optimization.

Here's a random example of what selector RDO looks like on kodim18. At these quality settings, Basis ETC1 intermediate format gets 1.52 bits/texel, RDO compression (with the exact same quality level) on the ETC1 .KTX file+LZMA gets 1.83 bits/texel, and plain ETC1+LZMA is 3.33 bits/texel.

Delta selector RDO Enabled: 1.52 bits/texel Luma SSIM: .867811 (plain non-RDO ETC1 Luma SSIM is .981066)

Delta selector RDO Disabled: 1.61 bits/texel SSIM: .900380

RGB delta image (biased to 128):

The two textures above share the exact same endpoint/selector codebooks, the same per-macroblock mode values (flip and differential bits), and the same block color "endpoints". The delta selectors in the 2nd image were chosen to balance introduced noise vs. bitrate. I'm still tuning this algorithm but it's amazing to find improvements like this with so little code. This improvement and others like it will also be available in Basis's support for the DXT formats.

We also support an "ETC1S" mode, that doesn't support per-block tiling or 4:4:4 "individual" mode colors. It transcodes to standard ETC1 blocks. It gets 1.1 bits/texel, Luma SSIM .8616 using the same selector/endpoint codebook sizes. This mode has sometimes noticeable 4x4 block artifacts, but otherwise can perform extremely well in a rate-distortion sense vs. full ETC1 mode.

ETC1S mode has better behavior in a rate-distortion sense vs. full ETC1 mode, so there's still a lot of work to do.

The encoder's backend hasn't been tuned hardly at all yet, there are some missing optimization steps (that crunch used), and the selector codebook compression stage needs to be redone. The ETC1 per-block "mode" bits are also pretty expensive right now, so I need to implement a form of explicit "mode bit" RDO that purposely biases the mode bits in the front end to achieve better compression.

Monday, January 16, 2017

ETC1 intermediate format progress

I've been working on support for the ETC1 format in Basis. The RDO front end class is done, and I'm writing the first ETC1 encoder+transcoder tonight. ETC1 is different enough from DXT1 that a redesign/rewrite is required.

Check out these histograms of the symbols the encoder sends to the transcoder (for kodim18).

Tuesday, January 3, 2017

The "Faster Zombies!" blog post

I'll never forget this post:

Gabe Newell himself wrote a lot of this post in front of me. From what I could tell, he seemed flabbergasted and annoyed that the team didn't immediately blog this info once we were solidly running faster in OpenGL vs. D3D. (Of course we should have blogged it ourselves! One of our missions as a team inside of Valve was to build a supportive community around our efforts.) From his perspective, it was big news that we were running faster on Linux vs. Windows. I personally suspect his social network didn't believe it was possible, and/or there was some deeper strategic business reason for blogging this info ASAP.

I stood behind Gabe and gave him all the data concerning GL vs. D3D performance while he typed the blog post in. I was (and still remain) extremely confident that our results were real. We conducted these tests as scientifically as we could, using two machines with the same hardware, configured in precisely the same way in the BIOS's, etc. NVidia and AMD were able to reproduce our results independently. Also, I could have easily made L4D2 on Linux GL run even faster vs. Windows, but we had other priorities like getting more Source1 games working, and helping Intel with their open source GL driver. From what I understand, Linux has some inherent advantages at the kernel level vs. Windows that impact driver performance.

Curiously and amusingly, another key developer on the team (who will remain unnamed as a professional courtesy) seemed to be terrified of being involved in this post when Gabe came over. When I stepped up to tell Gabe what to say, this developer looked totally relieved. I should have realized right there that I had a lot more influence that I suspected. Every time Gabe came over, it seemed like I was in the hot seat. Those were some challenging and amazing experiences!

A few weeks after this post went out, some very senior developers from Microsoft came by for a discreet visit. They loved our post, because it lit a fire underneath Microsoft's executives to get their act together and keep supporting Direct3D development. (Remember, at this point it was years since the last DirectX SDK release. The DirectX team was on life support.) Linux is obviously extremely influential.

It's perhaps hard to believe, but the Steam Linux effort made a significant impact inside of multiple corporations. It was a surprisingly influential project. Valve being deeply involved with Linux also gives the company a "worse case scenario" hedge vs. Microsoft. It's like a club held over MS's heads. They just need to keep spending the resources to keep their in-house Linux expertise in a healthy state.

I look back and now I see that I was totally not ready to handle the stress involved in this project. My ability and experience managing stress was very limited back then. I took on the lion's share of the graphics problems in the Source1 engine releases, I had to interact with the external driver teams and their managers/execs, help and present to external game developers about how to get good performance out of OpenGL, dodge patent attacks on my open source software, survive the rage of one corporation as I (naively!) helped their competitors implement support for a public GL extension, and also try to survive during the mass layoffs of 2013.

Unfortunately, these layoffs did impact the Linux team's morale and development efficiency. Even so, I did have a great time working at Valve overall. It was the most challenging and educational experience I've ever had. I highly doubt I could have gotten this kind of life changing experience at any other company.

I love ebikes

Here's me ebiking to work at Boss Fight Entertainment in McKinney, TX, on a custom ebike built from parts acquired from In this video, I was only running at around 40v 20a, and the rest of the power was coming from peddling (me!). I eventually pushed the controller to its limit (100v), and switched from A123 LiFe cells to the much more "touchy" LiPo chemistry.

Dallas turned out to be a wonderfully ebike friendly area. There are endless miles and miles of golf cart trails, bike/walking trails, and virtually unused streets all over the Dallas metroplex. Also, I was almost invisible in Dallas as an ebiker. (Which is both good, and bad.)

Pic of Matt Pritchard's shipped title award

Pic taken during a late night debugging session. (I have one just like it.) This thing is so well done and high quality.

Ensemble Studios (Microsoft) was such a classy outfit. No other company I've worked at treated its employees as well as Ensemble's did (including Valve).