Wednesday, June 22, 2011

Ngrams with coroutines in Python

This is how I define ngrams with coroutines:
import collections

def coroutine(func):
    """ A decorator function that takes care
    of starting a coroutine automatically on call """
    def start(*args, **kwargs):
        coro = func(*args, **kwargs)
        next(coro)  # advance to the first yield so send() works
        return coro
    return start

@coroutine
def ngrams(n, target):
    """ A coroutine to generate ngrams.
    Accepts one char at a time """
    chars = collections.deque()
    while True:
        chars.append((yield))
        if len(chars) == n:
            target.send(chars)
            chars.popleft()
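
To see what the ngrams coroutine emits, here is a minimal sketch that feeds it the word "hello" and gathers trigrams in a list. The `collect` sink is a hypothetical helper, not part of the original source. Note one deliberate change: this version sends `tuple(chars)` instead of the deque itself, because a sink that stores what it receives would otherwise hold several references to the same deque, which keeps being mutated.

```python
import collections


def coroutine(func):
    """Prime a generator-based coroutine so it is ready for send()."""
    def start(*args, **kwargs):
        coro = func(*args, **kwargs)
        next(coro)  # advance to the first yield
        return coro
    return start


@coroutine
def ngrams(n, target):
    """Receive one char at a time and forward each window of n chars."""
    chars = collections.deque()
    while True:
        chars.append((yield))
        if len(chars) == n:
            # Assumption for this sketch: send an immutable snapshot,
            # since the deque is modified right after.
            target.send(tuple(chars))
            chars.popleft()


@coroutine
def collect(results):
    """Hypothetical sink: append every received item to a list."""
    while True:
        results.append((yield))


trigrams = []
pipeline = ngrams(3, collect(trigrams))
for c in "hello":
    pipeline.send(c)

print(trigrams)
# [('h', 'e', 'l'), ('e', 'l', 'l'), ('l', 'l', 'o')]
```

A sink that unpacks the ngram immediately, like a counter, can safely receive the deque itself; the snapshot only matters when the received object is kept around.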

I need to filter the text before generating ngrams, and I also want to process the ngrams (in this case, count bigrams):
@coroutine
def filter_chars(accepted_chars, target):
    """ A coroutine to filter out unaccepted chars.
    Accepts one char at a time """
    while True:
        c = (yield)
        if c.lower() in accepted_chars:
            target.send(c.lower())

@coroutine
def counter(matrix):
    """ A counter sink """
    while True:
        a, b = (yield)
        # pos maps each accepted char to its index in the matrix
        matrix[pos[a]][pos[b]] += 1

Then I chain the coroutines together:
# k, accepted_chars, pos and enc are defined in the full source
counts = [[10 for i in xrange(k)] for i in xrange(k)]
bigrams = filter_chars(accepted_chars, ngrams(2, counter(counts)))
for c in open('big.txt').read().decode(enc):
    bigrams.send(c)
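
The snippets above rely on names from the full source (`k`, `accepted_chars`, `pos`, `enc`, `big.txt`). As a rough, self-contained sketch of the same pipeline, the version below runs on a short string instead of a file. Everything it defines for those names is an assumption made for illustration: a three-character alphabet, counts starting at 0 so the result is easy to read, and `pos` passed to the sink explicitly instead of being a global.

```python
import collections


def coroutine(func):
    """Prime a generator-based coroutine so it is ready for send()."""
    def start(*args, **kwargs):
        coro = func(*args, **kwargs)
        next(coro)
        return coro
    return start


@coroutine
def ngrams(n, target):
    """Forward each sliding window of n chars to target."""
    chars = collections.deque()
    while True:
        chars.append((yield))
        if len(chars) == n:
            target.send(chars)
            chars.popleft()


@coroutine
def filter_chars(accepted_chars, target):
    """Lowercase each char and drop the ones not in accepted_chars."""
    while True:
        c = (yield)
        if c.lower() in accepted_chars:
            target.send(c.lower())


@coroutine
def counter(matrix, pos):
    """Sink: count each bigram (a, b) in matrix[pos[a]][pos[b]]."""
    while True:
        a, b = (yield)
        matrix[pos[a]][pos[b]] += 1


# Assumed toy alphabet; the real one lives in the full source.
accepted_chars = "ab "
pos = {c: i for i, c in enumerate(accepted_chars)}
k = len(accepted_chars)
counts = [[0] * k for _ in range(k)]

bigrams = filter_chars(accepted_chars, ngrams(2, counter(counts, pos)))
for c in "Ab, ba!":
    bigrams.send(c)

print(counts)
# [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
```

The comma and the exclamation mark are dropped by the filter, so the counted bigrams are "ab", "b ", " b" and "ba". Each stage only knows about the next one, which is what makes it easy to swap the sink or insert another filter.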

Full source can be found in my fork of rrenaud's gibberish detector.

Thursday, June 9, 2011

Stack Overflow: the most upvoted Q/A tagged as Perl vs as Python

The most upvoted question tagged as Perl
The most upvoted question tagged as Python

Currently, for Python, it is the hidden gems in the language.
You just cannot upvote some of them enough.

On the other hand, the most upvoted Perl question at this moment is about Unicode oddities. And the most upvoted answer will scare the shit out of you.

Wednesday, June 8, 2011

Start programming, learn Python


Recently I've been asked to send a few links to absolute beginners who want to start programming. As a first language, I always recommend starting with Python. So I ended up collecting a list of useful links for Python beginners:


Tutorials


Ask for help


Official tutorials 


Read some code
Students need to read real code, but it is hard to find production code that is readable for novices.

Practice, write code