Pandas to_csv Encoding Error Solution

As these things typically go, last week I ran into an unusual error when using DataFrame.to_csv:

/usr/local/lib/python3.6/dist-packages/pandas/io/formats/csvs.py in _save_chunk(self, start_i, end_i)
354 )
355
--> 356 libwriters.write_csv_rows(self.data, ix, self.nlevels, self.cols, self.writer)

pandas/_libs/writers.pyx in pandas._libs.writers.write_csv_rows()

UnicodeEncodeError: 'utf-8' codec can't encode characters in position 31-32: surrogates not allowed

The error was unusual to me because I was using Pandas the way I typically do, on data that should not have differed meaningfully in type from the data sets I’ve used it on before. This was a real head-scratcher, and no number of Stack Overflow answers, GitHub comments, or blog posts seemed to offer a good answer.

After a lot of trial and error, I concluded that the raw data itself was the problem, not some weird side effect of re.sub or the other munging operations I was doing. As the "surrogates not allowed" message hints, some fields contained surrogate code points, which UTF-8 cannot encode. In short, I needed to clean up the encoding of every field in the entire DataFrame. Here’s the solution, if you’re in the same boat:

new_df = original_df.applymap(lambda x: str(x).encode("utf-8", errors="ignore").decode("utf-8", errors="ignore"))

I fully expect this approach is imperfect and non-optimal (for one thing, str(x) turns every cell into a string, numbers and missing values included), but it works. I’d be happy to hear suggestions.
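One possible refinement, offered as a sketch rather than a tested fix for the original data: clean only the cells that are already strings, so other values keep their type. The helper name strip_surrogates and the sample string are made up for illustration.

```python
def strip_surrogates(value):
    """Drop characters UTF-8 cannot encode (such as lone surrogates).

    Non-string values are returned unchanged, so numbers and missing
    values keep their original type.
    """
    if isinstance(value, str):
        # errors="ignore" silently discards anything the codec rejects.
        return value.encode("utf-8", errors="ignore").decode("utf-8")
    return value


# Hypothetical sample: "\ud83d" is a lone high surrogate, the kind of
# character that triggers "surrogates not allowed" on encoding.
sample = "price: 10 \ud83d units"
cleaned = strip_surrogates(sample)

# Assumed usage with the DataFrame from the post:
#   new_df = original_df.applymap(strip_surrogates)
# (pandas 2.1 and later prefer DataFrame.map over applymap.)
```

Because the surrogate is simply dropped, the text on either side of it is left as it was, including the surrounding spaces.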

 

Relevant reading:

  1. pandas.DataFrame.applymap
  2. String encode()
  3. String decode()
  4. Python standard encodings

How I Created A Robot Researcher With Zapier, Evernote and DuckDuckGo

An idea I’ve been noodling on for quite some time (going back several years to my junior year at TCU) is a tool that automatically researches topics in the background for you. One such tool, Dunno, existed for a brief period of time, but the company now appears to be defunct.

Earlier today, I decided to take a stab at setting up a complex multi-step zap in Zapier that would tie into Evernote to pull the subject and post the result. I figured DuckDuckGo would have some sort of API to access their instant answers and found that they do indeed. Zapier’s Code action allows you to run Python code, but doesn’t allow you to import additional libraries. To work around this, I found Mashape (listed on DuckDuckGo’s API page) fully sufficient.

The implementation details are below:

#1: Create a trigger to watch for new notes in a specific Evernote notebook.

#2: [Optional] Create a filter to only proceed with notes bearing a specific tag. You can skip this if you want the zap to run with any new note added to a certain notebook.

#3: Create a code action to fetch the DuckDuckGo results as JSON. I used Python. Sample code below.

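# Zapier passes the note text in via input_data; swap spaces for "+" so it fits in a URL.
query = input_data["note"].replace(" ", "+")

request_url = "https://duckduckgo-duckduckgo-zero-click-info.p.mashape.com/?format=json&no_html=1&no_redirect=1&q=" + query + "&skip_disambig=1"

response = requests.get(
    request_url,
    headers={
        "X-Mashape-Key": "YOUR_KEY",
        "Accept": "application/json",
    },
)

# Whatever is assigned to output becomes available to later steps in the zap.
output = response.json()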

#4: [Optional] Create a code action to compile a link that launches a DuckDuckGo search for your topic. I created this mostly for convenience; if I wanted to dig further into a topic, I could easily click the generated link. Sample code below.

query = input_data["note"].replace(" ", "+")
compiled_link = "https://duckduckgo.com/?q=" + query

output = [{'search_link': compiled_link}]

#5: Create an action to append the results to your note (or create a new note with the results).
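The samples in steps #3 and #4 only replace spaces, so a topic containing characters like "&" or "+" would produce a malformed URL. A minimal sketch of a safer way to build both URLs follows, assuming the standard library's urllib.parse can be imported in your code step. The function names are hypothetical; the endpoint and parameters are the ones from the samples above.

```python
from urllib.parse import urlencode

# Endpoints copied from the sample code in steps #3 and #4.
API_BASE = "https://duckduckgo-duckduckgo-zero-click-info.p.mashape.com/"
SEARCH_BASE = "https://duckduckgo.com/"


def build_api_url(topic):
    """Build the instant-answer request URL with the topic percent-encoded."""
    params = {
        "format": "json",
        "no_html": 1,
        "no_redirect": 1,
        "q": topic,
        "skip_disambig": 1,
    }
    # urlencode escapes reserved characters and turns spaces into "+".
    return API_BASE + "?" + urlencode(params)


def build_search_link(topic):
    """Build a clickable DuckDuckGo search link for the same topic."""
    return SEARCH_BASE + "?" + urlencode({"q": topic})
```

In a code step, request_url = build_api_url(input_data["note"]) would then stand in for the string concatenation.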

 

I have a long list of topics I am idly interested in. I created this automated researcher to help feed those curiosities. If, after reading the results, the topic continues to pull on my mind, I am free to devote my time to researching it more deeply. If not, I can just file it away with the result included in case I ever want to look it up again.

I’m aiming to create a similar task using the Email Parser trigger to let me email topics to my automated researcher and have it save its results to a new Evernote note.

Transferring files to and from Pythonista

Pythonista is a fantastic iOS app, one that really pushes the boundaries of what iPhones and iPads are capable of doing. However, due to Apple’s restrictions, developer Ole Zorn can’t include any kind of file syncing for scripts within the app, nor can he add Pythonista to the system share sheet as a destination for files.

There is a workaround, though: copy and paste this script by Ole into Pythonista, then run it. The script creates an FTP server that is accessible over your local network. You can then use Panic’s excellent Transmit to transfer files to and from Pythonista on the same iOS device.