One thing that isn't documented clearly anywhere is that you need to enable the right modules before Solr will process PDFs.
I was getting the following error:
"msg":" Error loading class 'solr.extraction.ExtractingRequestHandler'",
The extraction library was available but it wasn't being loaded properly.
There is a key comment in bin/solr.in.sh:
Settings here will override settings in existing env vars or in bin/solr.
This means that settings in this file take precedence over both existing environment variables and bin/solr itself when Solr starts. I tried setting environment variables and updating the Solr config files, but neither helped.
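For reference, here is a minimal sketch of what the change in bin/solr.in.sh might look like. It assumes Solr 9.x, where optional modules are listed in the SOLR_MODULES variable and the module that provides ExtractingRequestHandler is named "extraction". The note above doesn't name the exact setting, so treat both as assumptions and check them against your version.

```shell
# bin/solr.in.sh
# Assumption: Solr 9.x, where optional modules are enabled via SOLR_MODULES
# as a comma-separated list of module names.
# "extraction" is assumed to be the module that ships ExtractingRequestHandler.
SOLR_MODULES="extraction"
```

Solr reads this file at startup, so it needs a restart before the change takes effect.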