Functional Programming and Django QuerySets

LA Django Meetup

July 5th, 2016

tiny.cc/jtaqs4

@johntellsall

Note

Senior dev/server guy; DevOps
20 years experience with Python
john@johntellsall.com
first PyCon I went to had 40 people!

Ideas

iterator/generator = "stream"

Functional Programming: programming with composition

QuerySet is a stream

Note

https://en.wikipedia.org/wiki/Function_composition_%28computer_science%29

combine simple functions to build more complicated ones
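
A minimal sketch of composition, written in Python 3 syntax (the slides themselves use Python 2). The `compose` helper and the `clean` name are illustrative and not part of the talk:

```python
def compose(*funcs):
    """Build one function that applies funcs right to left."""
    def composed(value):
        for func in reversed(funcs):
            value = func(value)
        return value
    return composed

# Two simple string functions combined into a more complicated one:
# strip runs first, then upper.
clean = compose(str.upper, str.strip)
print(clean("  whiskey \n"))
```

Each piece stays small and testable on its own; the combined function is just the pieces applied in order.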

Iterator review

An iterator is a stream of data—sort of a restricted, compact list or cursor.

>>> list([1,2])
[1, 2]

>>> iter([1,2])
<listiterator object at 0x7f429d83c750>

Note

iterators have a current item and a next(), and that's it. Preferred, because they take almost no space

You already use iterators

iterate across a stream of strings

>>> f = open('recipe.ini')
>>> for line in f:
    print line

# very tasty
[Old Fashioned]
1:1.5 oz whiskey
2:1 tsp water
3:0.5 tsp sugar
4:2 dash bitters

Note

you already use iterators

Ex: Database iterator

Lists/Iterators are very similar

for line in open('ing.txt'):
    print line

for num in iter([2,4,6,8]):
    print num

for num in [2,4,6,8]:
    print num

import glob
for name in glob.iglob('*.txt'):
    print name

Why use Iterators over Lists?

"tools that use iterators are more memory efficient (and often faster) than their list based counterparts."

What can you do with an iterator?

>>> f = open('ing.txt')
>>> f.next()
'# Old Fashioned\n'
>>> f.next()
'1.5 oz whiskey\n'

What happens at the end?

>>> f = open('/dev/null')
>>> f.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

>>> iter([]).next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
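
A rough sketch of why you rarely see StopIteration in practice: a `for` loop catches it for you. This uses the Python 3 spelling `next(it)`; the `loop` helper is a hypothetical name used only for illustration.

```python
def loop(iterable, action):
    """Roughly what `for item in iterable: action(item)` does."""
    it = iter(iterable)
    while True:
        try:
            item = next(it)      # Python 2 also allows it.next()
        except StopIteration:
            break                # end of stream: stop quietly
        action(item)

seen = []
loop([2, 4, 6], seen.append)

# next() takes an optional default, returned instead of raising at the end.
empty = iter([])
last = next(empty, None)
print(seen, last)
```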

What can you not do with an iterator?

no slicing

>>> f = open('ing.txt')
>>> f[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'file' object has no attribute '__getitem__'

What can you not do with an iterator?

no length

>>> f = open('ing.txt')
>>> len(f)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'file' has no len()

List vs Iterator

feature	list	iterator
overall	eager	lazy
memory	high	low
len(x)	yes	no
slice	x[:3]	islice(x, 3)
addition	x + y	chain(x, y)
has items	if x	no
easy debug	yes	no

Note: Python 2/3 are quite different

Note

Lists are "eager" -- they know everything about their contents all the time

A million-item list can be rough, because it holds all million items in memory

A million-item iterator is no biggie; you can process a few items at a time
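
A small sketch of the memory difference, assuming Python 3 and CPython's `sys.getsizeof` (which reports the size of the container object itself, not of the items it refers to). Variable names are illustrative.

```python
import sys
from itertools import islice

numbers_list = [n * 2 for n in range(1000000)]   # eager: all items exist now
numbers_gen = (n * 2 for n in range(1000000))    # lazy: items made on demand

list_size = sys.getsizeof(numbers_list)   # grows with the number of items
gen_size = sys.getsizeof(numbers_gen)     # small, independent of length

# Process just a few items; the rest are never computed.
first_three = list(islice(numbers_gen, 3))
print(list_size > gen_size, first_three)
```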

Common iterator functions

enumerate(iter)
sorted(iter)
range(stop)
dict.iteritems()

very important:

filter(func/None, iter)
map(func, *iterables)

and itertools, and fileinput

Note

tradeoff readability vs conciseness

Functional Programming

Practical Advantages to FP

Modularity

Composability!

Ease of debugging and testing

Parallelization

Buzzwordy!

What is Functional Programming

functions are first class (objects)
focus on list processing
no side effects
expressions over statements
“higher order” functions

functions that operate on functions

>>> def is_odd(num):
    return num % 2

>>> filter(is_odd, [1, 2, 3])
[1, 3]

programming paradigms

procedural

list of instructions

can modify caller's state

object oriented

object has state and functions to query/modify state

specialize by subclassing

FP vs Procedural programming

procedural: list of instructions

def upfile(inpath, outpath):
    with open(outpath, 'w') as outf:
        for line in open(inpath):
            outf.write( line.upper() )

upfile('ing.txt', '/dev/stdout')

Note

how can you test this?
run in parallel?
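
One possible answer to the testing question, sketched in Python 3: split the pure transform from the file handling, so the transform can be tested with in-memory data. The names `upcase_lines`, `fake_in` and `fake_out` are hypothetical, and this `upfile` takes file objects rather than paths.

```python
import io

def upcase_lines(lines):
    """Pure transform: a stream of lines in, a stream of upper-cased lines out."""
    return (line.upper() for line in lines)

def upfile(inf, outf):
    """Thin I/O shell around the pure transform."""
    outf.writelines(upcase_lines(inf))

# No real files needed: any iterable of strings can stand in for the input.
fake_in = io.StringIO("1.5 oz whiskey\n1 tsp water\n")
fake_out = io.StringIO()
upfile(fake_in, fake_out)
print(fake_out.getvalue())
```

Because `upcase_lines` has no side effects, separate chunks of input could also be handed to separate workers without coordination.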

Note

[Many] Languages are procedural: programs are lists of instructions that tell the computer what to do with the program’s input.

FP vs Object Orientation

object oriented: Object has state and specific functions to query/modify state. Easy to specialize by subclassing.

class RWFile(list):
    def __init__(self, inpath):
        super(RWFile, self).__init__(open(inpath))
    def transform(self, line):
        return line
    def writelines(self, outpath):
        with open(outpath, 'w') as outf:
            for line in self:
                outf.write( self.transform(line) )

class UpFile(RWFile):
    def transform(self, line):
        return line.upper()

UpFile('recipe.ini').writelines('/dev/stdout')

Note

Object-oriented programs manipulate collections of objects. Objects have internal state and support methods that query or modify this internal state in some way. Smalltalk and Java are object-oriented languages. C++ and Python are languages that support object-oriented programming, but don’t force the use of object-oriented features. ["Object obsessive"]

Functional Programming

procedural

list of instructions

object oriented

object has state and functions to query/modify state

specialize by subclassing

functional

functions operate on streams of objects

combine simple functions => complicated

preferably without internal state
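
A sketch of "functions operate on streams", assuming Python 3. Each stage below is a hypothetical helper that takes a stream and returns a new stream, keeps no internal state, and can be recombined freely:

```python
def strip_newlines(lines):
    return (line.rstrip("\n") for line in lines)

def drop_blank(lines):
    return (line for line in lines if line)

def upcase(lines):
    return (line.upper() for line in lines)

def pipeline(lines):
    # Each stage wraps the previous stream; nothing runs until consumed.
    return upcase(drop_blank(strip_newlines(lines)))

result = list(pipeline(["sugar\n", "\n", "bitters\n"]))
print(result)
```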

UpFile example in Functional Programming

using a generator expression

open('out.txt', 'w').writelines(
    line.upper() for line in open('in.txt')
)

Note

seed, then transforms recombine elements, vs specialize

UpFile in FP: map

specialize with named function and map

def upcase(line):
    return line.upper()

open('out.txt', 'w').writelines(
    map(upcase, open('in.txt'))
)

map-filter

map(func, iter) -- transform items using a function

def square(num):
    return num ** 2

>>> map(square, [1,2,3])
[1, 4, 9]

Note

map applies a passed-in function to each item in an iterable and returns a list containing all the results (in Python 2; Python 3 returns a lazy iterator instead).

map-filter

filter(func, iter) -- provide items matching a function

def is_odd(num):
    return num % 2

>>> filter(is_odd, [1, 2, 3])
[1, 3]

Note

filter keeps only the items for which a test function returns true. A related tool, reduce, applies a function to pairs of (running result, item) to collapse a stream into a single value.
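
A short sketch of filter feeding reduce, assuming Python 3 (where `reduce` lives in `functools` and `filter` is lazy). The `add` helper is illustrative.

```python
from functools import reduce

def is_odd(num):
    return num % 2

def add(total, num):
    return total + num

odds = filter(is_odd, [1, 2, 3, 4, 5])   # lazy stream of 1, 3, 5
total = reduce(add, odds, 0)             # ((0 + 1) + 3) + 5
print(total)
```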

Functional Programming examples

Example: Windows INI-file parser; aka ConfigParser

stream of lines
stream of valid lines (no comments, has key-value)
stream of key-value match objects
dictionary
TBD: dict of dictionaries

parse1.py

# 1. stream of lines
import fileinput
lines = fileinput.input()
print ''.join( lines )

# very tasty
[Old Fashioned]
1:1.5 oz whiskey
2:1 tsp water
3:0.5 tsp sugar
4:2 dash bitters

parse2.py

# 2. stream of valid lines
import fileinput
from itertools import *
def has_comment(line):
    return line.startswith('#')
def has_keyvalue(line):
    return ':' in line
lines = ifilterfalse( has_comment, fileinput.input() )
lines = ifilter( has_keyvalue, lines )
print ''.join( lines )

1:1.5 oz whiskey
2:1 tsp water
3:0.5 tsp sugar
4:2 dash bitters

parse3.py

# 3. stream of key-value match objects
import fileinput, re
from itertools import *
def has_comment(line):
    return line.startswith('#')
def parse_keyvalue(line):
    m = re.match(r'(\S+):(.+)', line)
    if m:
        return m.groups()
    return None
matches = (parse_keyvalue(line) for line in fileinput.input())
keyvalues = ifilter(None, matches)
print '\n'.join( (str(kv) for kv in keyvalues) )

('1', '1.5 oz whiskey')
('2', '1 tsp water')
('3', '0.5 tsp sugar')
('4', '2 dash bitters')

parse4.py

# 4. dictionary
import fileinput, re
from itertools import *
def has_comment(line):
    return line.startswith('#')
def parse_keyvalue(line):
    m = re.match(r'(\S+):(.+)', line)
    if m:
        return m.groups()
    return None
lines = ifilterfalse(has_comment, fileinput.input())
matches = (parse_keyvalue(line) for line in lines)
keyvalues = ifilter(None, matches)
confdict = dict(keyvalues)
print confdict

{'1': '1.5 oz whiskey', '3': '0.5 tsp sugar', '2': '1 tsp water', '4': '2 dash bitters'}
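
The remaining "dict of dictionaries" step is left as TBD in the outline. One way it might look is sketched below, in Python 3. This is an assumption, not the talk's code: `parse_sections` is a hypothetical helper, and it keeps a little state (the current section), so it is less purely functional than the earlier steps.

```python
import re

def parse_sections(lines):
    """Group key:value pairs under the most recent [Section] header."""
    config = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#'):
            continue                      # skip blanks and comments
        header = re.match(r'\[(.+)\]$', line)
        if header:
            current = config.setdefault(header.group(1), {})
            continue
        pair = re.match(r'(\S+):(.+)', line)
        if pair and current is not None:
            key, value = pair.groups()
            current[key] = value
    return config

recipe = [
    '# very tasty\n',
    '[Old Fashioned]\n',
    '1:1.5 oz whiskey\n',
    '2:1 tsp water\n',
]
confdict = parse_sections(recipe)
print(confdict)
```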

itertools

chain()
compress()
count()
cycle()
dropwhile()
groupby()
ifilter()
ifilterfalse()

imap()
islice()
izip()
izip_longest()
repeat()
starmap()
takewhile()
tee()

chain -- only for iterators

chain(*iterables) gives the elements of each stream in order. Equivalent to + for lists.

>>> [1,2] + [3]
[1, 2, 3]

>>> from itertools import *
>>> chain(iter([1,2]), iter([3]))
<itertools.chain object at 0x7f429d848510>
>>> list(_)
[1, 2, 3]

Note

stream of objects with state; lazy vs eager

islice -- similar to list

islice(iter, num) -- return first few items

>>> list([1, 2, 3])[:2]
[1, 2]

>>> iter([1, 2, 3])[:2]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'listiterator' object has no attribute '__getitem__'

>>> from itertools import islice
>>> islice(iter([1, 2, 3]), 2)
<itertools.islice object at 0x7f429d7de9f0>
>>> list(_)
[1, 2]

Note

https://docs.python.org/dev/howto/functional.html

Toolz example

>>> def stem(word):
...     """ Stem word to primitive form """
...     return word.lower().rstrip(",.!:;'-\"").lstrip("'\"")

>>> from toolz import compose, frequencies, partial
>>> from toolz.curried import map
>>> wordcount = compose(frequencies, map(stem), str.split)

>>> sentence = "This cat jumped over this other cat!"
>>> wordcount(sentence)
{'this': 2, 'cat': 2, 'jumped': 1, 'over': 1, 'other': 1}

Functional Programming & Python resources

short functional library and related FP talk - https://github.com/kachayev/fn.py http://kachayev.github.io/talks/uapycon2012/index.html#/27
tons of "awesome" links, including to FP in JS https://github.com/xgrommx/awesome-functional-programming
practical examples, more details https://newcircle.com/bookshelf/python_fundamentals_tutorial/functional_programming
good intro https://maryrosecook.com/blog/post/a-practical-introduction-to-functional-programming
Toolz libraries https://github.com/pytoolz/toolz
Packt book (free!) https://www.packtpub.com/application-development/functional-python-programming
O'Reilly book (free!) http://www.oreilly.com/programming/free/functional-programming-python.csp

Django QuerySets

represents a stream of rows from the database

Note

models.py

source: http://blog.etianen.com/blog/2013/06/08/django-querysets/

QuerySets are Django's way of getting and updating data

from django.db import models

class Meeting(models.Model):
    name = models.CharField(max_length=100)
    meet_date = models.DateTimeField()

QuerySet review

>>> Meeting.objects.get(id=12)
<Meeting: Meeting object>

>>> vars( Meeting.objects.get(id=12) )
{'meet_date': datetime.datetime(2016, 7, 5, 7, 0, tzinfo=<UTC>),
'_state': <django.db.models.base.ModelState object at 0x2bd1050>,
'id': 12, 'name': u'LA Django Monthly Meeting'}

>>> x = Meeting.objects.filter(name__icontains='go')
>>> for a in x: print a.name
LA Django Monthly Meeting

QuerySet and iterators

QuerySets can be shifty

>>> x = Meeting.objects.filter(name='java')
>>> x
[]
>>> type(x)
<class 'django.db.models.query.QuerySet'>

Functional QuerySets

How can you tell if a list is empty or not?

an iterator?

a QuerySet?

Empty List?

Note

How can you tell if a list is empty or not?

A: Empty List

>>> bool([])
False
>>> bool(['beer'])
True

Note

Lists are eager -- always know everything

Empty Iterator?

Note

How can you tell if an iterator is empty or not?

A: Empty Iterator

>>> x=iter([1,2])
>>> bool(x)
True
>>> x=iter([])
>>> bool(x)
True

Note

Iterators are lazy -- don't know what they contain!
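
Since `bool()` cannot answer the question, one common workaround is to pull the first item and then put it back. A minimal sketch in Python 3; `peek` and `_EMPTY` are hypothetical names, not standard library functions:

```python
from itertools import chain

_EMPTY = object()   # sentinel that cannot collide with real items

def peek(iterator):
    """Return (is_empty, stream); the returned stream still yields every item."""
    first = next(iterator, _EMPTY)
    if first is _EMPTY:
        return True, iter(())
    # Put the consumed item back in front of the rest of the stream.
    return False, chain([first], iterator)

empty1, stream1 = peek(iter([]))
empty2, stream2 = peek(iter(['beer', 'wine']))
items = list(stream2)
print(empty1, empty2, items)
```

Note that the original iterator must not be used after peeking, because its first item has already been consumed; use the returned stream instead.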

How can you tell if a QuerySet is empty or not?

QuerySet like Iterator

filter with QuerySet:

>>> from meetup.models import *
>>> Meeting.objects.filter(id=1)
[<Meeting: Meeting object>]

filter with list:

>>> filter(lambda d: d['id']==1, [{'id':1}, {'id':2}])
[{'id': 1}]

filter with iterator:

>>> from itertools import ifilter
>>> list(ifilter(lambda d: d['id']==1, iter([{'id':1}, {'id':2}])))
[{'id': 1}]

Because QuerySet is an iterator

>>> from meetup.models import *
>>> Meeting.objects.filter(id=1)
[<Meeting: Meeting object>]

>>> type(Meeting.objects.filter(id=1))
<class 'django.db.models.query.QuerySet'>

Note

similar to iter: dynamic/lazy; list(qs)

diff: stream of objs, same class; qs[:3] <=> islice(it, 3); bool(iter) vs qs.exists()

>>> a=iter([])
>>> bool(a)
True

>>> a=[] ; bool(a)
False

qs.count()

laziness is explicit: prefetch_related

qs.values(); qs.values_list(); qs.values_list('name', flat=True)

Can mix/match QS/iterators...

>>> Meeting.objects.all()[0].id
1

>>> from itertools import *
>>> islice( Meeting.objects.all(), 1).next().id
1

>>> islice( Meeting.objects.all(), 1)
<itertools.islice object at 0x2bb9ec0>

>>> list(islice( Meeting.objects.all(), 1))
[<Meeting: Meeting object>]

...but not always

How can you tell if a QuerySet is empty or not?

Answer

Use x.exists(), not bool(x) -- more efficient
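
To see why this can be cheaper, here is a toy model using only the standard library's `sqlite3`, in Python 3. `TinyQuery` is a hypothetical stand-in written for this illustration; it is not Django's implementation and it ignores QuerySet result caching. It only contrasts "fetch every row, then test" with "ask for at most one row".

```python
import sqlite3

class TinyQuery(object):
    """Toy stand-in for a QuerySet over one table; NOT Django's implementation."""
    def __init__(self, conn, table):
        self.conn = conn
        self.table = table
        self.rows_fetched = 0

    def __iter__(self):
        # Iterating runs the full query and walks every row.
        for row in self.conn.execute('SELECT id, name FROM %s' % self.table):
            self.rows_fetched += 1
            yield row

    def exists(self):
        # Ask the database for at most one row instead of all of them.
        cur = self.conn.execute('SELECT 1 FROM %s LIMIT 1' % self.table)
        row = cur.fetchone()
        if row is not None:
            self.rows_fetched += 1
        return row is not None

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE meeting (id INTEGER PRIMARY KEY, name TEXT)')
conn.executemany('INSERT INTO meeting (name) VALUES (?)',
                 [('meeting %d' % n,) for n in range(1000)])

slow = TinyQuery(conn, 'meeting')
slow_answer = bool(list(slow))    # fetches all 1000 rows to learn "not empty"

fast = TinyQuery(conn, 'meeting')
fast_answer = fast.exists()       # fetches a single row
print(slow.rows_fetched, fast.rows_fetched)
```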

Note

https://docs.djangoproject.com/en/1.9/ref/models/querysets/#exists

Both iterators and QuerySets are lazy

In functional programming, we have functions which operate on infinite-length streams.

With QuerySets, it's assumed we have many thousands of results, but we don't want to fetch all of them at once before returning to caller.

Database (and Django) does a query, then gives us a few items. Once that batch is done, QuerySet will ask the database for another batch of results.

This means that for both iterators and query sets, we can do a little work, then process a batch, without waiting for the entire list of results.
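
A sketch of the same idea with plain iterators, in Python 3: an infinite stream consumed a batch at a time. The `squares` and `batches` helpers are hypothetical and only an analogy; this is not how Django fetches rows.

```python
from itertools import count, islice

def squares():
    """An infinite stream: it never finishes, so it could never be a list."""
    for n in count(1):
        yield n * n

def batches(stream, size):
    """Yield lists of up to `size` items, pulling from the stream only as needed."""
    stream = iter(stream)
    while True:
        batch = list(islice(stream, size))
        if not batch:
            return
        yield batch

# Only six squares are ever computed, even though the stream is endless.
first_two_batches = list(islice(batches(squares(), 3), 2))
print(first_two_batches)
```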

Proof: str(queryset.query) = SQL

>>> p=SourceLine.objects.filter(project='redis')

>>> str(p.query)

'SELECT "app_sourceline"."id", ... FROM "app_sourceline"
WHERE "app_sourceline"."project" = redis'

doesn't work for exists()!

>>> p.exists()
True

>>> str(p.exists().query)
AttributeError: 'bool' object has no attribute 'query'

Django SQL history

>>> SourceLine.objects.filter(project='redis').exists()
True

>>> from django.db import connection ; connection.queries[-1]

u'QUERY = u\'SELECT (1) AS "a" FROM "app_sourceline"
WHERE "app_sourceline"."project" = %s LIMIT 1\'
- PARAMS = (u\'redis',)'

Ideas

iterator/generator = "stream"

FP: functions operate on streams of immutable objects

QuerySet is a stream

Note

programming with composition

Questions?

john@johntellsall.com

Thanks

Elie Ceberio @eceberiotalener

Marcel Chastain @MarcelChastain @LADjango

Goz Inyama @notwitter

References

Can Your Programming Language Do This? by Joel Spolsky

http://www.joelonsoftware.com/items/2006/08/01.html

Wikipedia: Functional Programming

http://en.wikipedia.org/wiki/Functional_programming

Functional Programming HOWTO by Andy Kuchling

https://docs.python.org/2/howto/functional.html

Using Django querysets effectively by Dave Hall

(best blog title ever)

http://blog.etianen.com/blog/2013/06/08/django-querysets/
