
Unicode in PyPI

Mar 29, 2023

Why?



Malicious Actors Use Unicode Support in Python to Evade Detection (phylum.io)

Malicious PyPI Package Uses Unicode - Why? | Cyware Alerts - Hacker News


The recent detection of the onyxproxy package on PyPI is a perfect example of why it’s important to stay vigilant and up to date with new developments in cybersecurity.


This is Katy Craig in San Diego, California.


The malicious onyxproxy package harvests and exfiltrates credentials and other sensitive data, which should worry us all. What makes this particular threat even more concerning is its use of an obfuscation technique that was foreseen back in 2007, during a discussion about Python’s support for Unicode. Python allows Unicode characters in identifiers and normalizes them (using NFKC) when it parses code, so a styled variant of a letter resolves to the same name as the plain ASCII letter. Attackers can therefore hide their activities in plain sight by writing familiar names with look-alike characters instead of ordinary letters, making the code harder for people and anti-malware software to detect until it's too late!


When malicious keywords are encoded in Unicode characters, defenders may have difficulty understanding what the code does or how it works, which makes them less likely to identify exploits against vulnerabilities in their systems’ software or hardware components.
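To make the evasion concrete, here is a minimal Python sketch. It is not code from onyxproxy. It assumes the Mathematical Bold letters as the disguise alphabet, and the helper names to_math_bold, naive_scan and normalized_scan are hypothetical, chosen only for illustration.

```python
import unicodedata


def to_math_bold(word: str) -> str:
    """Swap ASCII lowercase letters for Mathematical Bold look-alikes.

    Assumption for this sketch: U+1D41A..U+1D433 (bold a..z) is the
    disguise alphabet. Other characters, such as "_", are left unchanged.
    """
    return "".join(
        chr(0x1D41A + ord(c) - ord("a")) if "a" <= c <= "z" else c
        for c in word
    )


def naive_scan(source: str) -> bool:
    """Hypothetical scanner: plain substring match on the raw source."""
    return "__import__" in source


def normalized_scan(source: str) -> bool:
    """Hypothetical scanner: fold the source with NFKC before matching."""
    return "__import__" in unicodedata.normalize("NFKC", source)


disguised = to_math_bold("__import__")
source = f"mod = {disguised}('os')"

# Python normalizes identifiers while parsing, so the disguised name
# still resolves to the built-in __import__ and the import succeeds.
namespace = {}
exec(source, namespace)

print("naive scan flags it:     ", naive_scan(source))
print("normalized scan flags it:", normalized_scan(source))
print("module actually imported:", namespace["mod"].__name__)
```

Folding text with NFKC before matching is one possible mitigation, not a complete defense. Flagging any non-ASCII identifier in a package for manual review is another option.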


For instance, there are five possible ways to write the single letter n, and 19 for the letter s. The name __import__ (a Python built-in function) can be written in over one billion different ways, which can easily bypass any security scan based on static pattern matching.
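The arithmetic behind numbers like these can be sketched in Python. This is a rough estimate under stated assumptions, not the method used to produce the figures quoted above. It counts every code point that NFKC-normalizes to a given ASCII letter and that Python accepts inside an identifier, and it holds underscores fixed. The names build_variant_table and count_spellings are hypothetical. The totals depend on the Unicode version bundled with your Python and on the counting criterion, so they may not match the quoted figures.

```python
import sys
import unicodedata
from collections import defaultdict
from math import prod


def build_variant_table():
    """Map each ASCII letter to the code points that fold to it under NFKC
    and are accepted by Python in a non-leading identifier position."""
    table = defaultdict(list)
    for cp in range(sys.maxunicode + 1):
        if 0xD800 <= cp <= 0xDFFF:  # skip surrogate code points
            continue
        ch = chr(cp)
        folded = unicodedata.normalize("NFKC", ch)
        if (
            len(folded) == 1
            and folded.isascii()
            and folded.isalpha()
            and ("_" + ch).isidentifier()
        ):
            table[folded].append(ch)
    return table


variants = build_variant_table()


def count_spellings(name: str) -> int:
    """Multiply the number of variants available for each letter.

    Characters without an entry in the table (underscores, digits)
    contribute a factor of 1, i.e. they are held fixed.
    """
    return prod(len(variants[c]) if c in variants else 1 for c in name)


for letter in "ns":
    print(letter, "has", len(variants[letter]), "candidate spellings")
print("__import__ has", count_spellings("__import__"), "candidate spellings")
```

Because the per-letter counts multiply, even a handful of variants per character produces far more spellings of a name than a fixed signature list could enumerate.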


The onyxproxy package was downloaded nearly 200 times before being detected and removed. Its viability as a malware delivery approach may catch the eye of more sophisticated hackers, though. So be on the lookout.


This is Katy Craig. Stay safe out there.

