Squinting at Python Objects



──────────────────
PyOhio
2011 July 30th
Brandon Craig Rhodes
──────────────────

slide

  1. Cyclic garbage collection
  2. Property attributes
  3. “Squinting” with squint

slide

Some instrumentation

slide

>>> # A base class that remembers its objects
>>>
>>> import weakref
>>> people = weakref.WeakKeyDictionary()
>>>
>>> class Person(object):
...     def __init__(self):
...         people[self] = None
...
>>> starbuck = Person()
>>> len(people)
1
>>> del starbuck
>>> len(people)
0

slide

>>> for i in range(400):
...     ahab = Person()
...     starbuck = Person()
...     ahab.firstmate = starbuck

At the bottom of each loop iteration:

len(people) == 2

slide

     ahab       starbuck
                  
┌───────────┐ ┌──────────┐
│  Person   │ │  Person  │
│“firstmate”→ │          │
│  refs==1  │ │ refs==2  │
└───────────┘ └──────────┘

slide

>>> for i in range(400):
...     ahab = Person()
...     starbuck = Person()
...     ahab.firstmate = starbuck
...     starbuck.captain = ahab

Here len(people) varies.

slide

cyclic-gc.png
len(people) > 2     What's going on?

slide

     ahab       starbuck
                  
┌───────────┐ ┌──────────┐
│  Person   │ │  Person  │
│“firstmate”→ ←“captain” │
│  refs==2  │ │ refs==2  │
└───────────┘ └──────────┘

slide

┌───────────┐ ┌──────────┐
│  Person   │ │  Person  │
│“firstmate”→ ←“captain” │
│  refs==1  │ │ refs==1  │
└───────────┘ └──────────┘

slide

cyclic-gc.png
“cyclic garbage collection” (Python≥2.0)

Lesson:

Cyclic garbage collection
only cleans up cycles
every so often
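
To watch this happen, one can switch the automatic collector off,
build some cycles, and then trigger a collection by hand. This is
a minimal sketch, assuming CPython's reference counting; the
count_before and count_after names are only for illustration.

```python
import gc
import weakref

people = weakref.WeakKeyDictionary()


class Person(object):
    def __init__(self):
        people[self] = None


was_enabled = gc.isenabled()
gc.disable()                      # keep the collector from running on its own
try:
    for i in range(10):
        ahab = Person()
        starbuck = Person()
        ahab.firstmate = starbuck
        starbuck.captain = ahab   # each pair now forms a cycle

    del ahab, starbuck
    count_before = len(people)    # cycles keep every Person alive
    gc.collect()                  # ask for a cyclic collection right now
    count_after = len(people)     # the cycles have been reclaimed
finally:
    if was_enabled:
        gc.enable()
```

Calling gc.collect() just before measuring is a handy way to get
stable numbers out of instrumentation like the people dictionary.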

Issue #2

Property attributes

slide

A property creates an attribute
that invokes code on the object
>>> class Captain(Person):
...     _firstmate = None
...
...     @property
...     def firstmate(self):
...         if self._firstmate is None:
...             self._firstmate = Person()
...             self._firstmate.captain = self
...         return self._firstmate

slide

Properties surprise you by creating
new objects when you look at them
>>> c = Captain()
>>> dir(c)
[..., 'firstmate']
>>> c.firstmate
<...Person object at 0x7f4b9db490c0>
If you are not careful, you might
think .firstmate was there the whole time

slide

The truth about an object
is in its __dict__ dictionary
>>> c = Captain()
>>> c.__dict__
{}
>>> c.firstmate
<...Person object at 0x7f4b9db490b0>
>>> c.__dict__
{'_firstmate': <...Person object at 0x7f4b9db490b0>}
We can see there was no _firstmate
until we induced its creation
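
One way to learn that an attribute is a property without
triggering it is to search the class dictionaries directly.
This is only a sketch: find_on_class and Mate are hypothetical
names, and the walk over type(x).__mro__ assumes ordinary
new-style classes.

```python
def find_on_class(x, name):
    """Look up name on x's classes without running any descriptor code."""
    for klass in type(x).__mro__:
        if name in klass.__dict__:
            return klass.__dict__[name]   # raw object; __get__ is not called
    return None


class Mate(object):
    pass


class Captain(object):
    _firstmate = None

    @property
    def firstmate(self):
        if self._firstmate is None:
            self._firstmate = Mate()
        return self._firstmate


c = Captain()
raw = find_on_class(c, 'firstmate')
is_property = isinstance(raw, property)
still_untouched = '_firstmate' not in c.__dict__   # no Mate was created
```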

slide

“Observer effect” in physics

When the mere act of measurement
causes a change in the value being measured.

slide

So, which actions on a Python
object are safe actions?
==
How can we safely “squint”
at an object without changing it?

slide

Note that print is not safe!

Nothing is safe that might call
__str__() or __repr__()

Safe operators

id(x)
type(x)
isinstance(x, )

slide

BTW: it is to make these safe
that Python implements them as plain
functions instead of attributes!
id(x)
type(x)
isinstance(x, )
See Armin Ronacher's
wonderful “Python and the
Principle of Least Astonishment”

slide

Safe-by-convention:

x.__class__
x.__class__.__module__
x.__class__.__name__
x.__dict__
x.__dict__['attribute']
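
Put together, these operations are enough to label an object
without ever calling its __repr__(). A minimal sketch, where
safe_label and Loud are made-up names for illustration:

```python
def safe_label(x):
    """Describe x using only its type and id, never its own methods."""
    cls = type(x)
    return '<%s.%s 0x%x>' % (cls.__module__, cls.__name__, id(x))


class Loud(object):
    repr_calls = 0

    def __repr__(self):
        Loud.repr_calls += 1      # the side effect we want to avoid
        return 'Loud()'


loud = Loud()
label = safe_label(loud)          # Loud.__repr__ is never invoked
```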

What about built-in containers?

If isinstance(x, dict)
then can we safely ask for its keys?

No!

>>> class MyDict(dict):
...     def keys(self):
...         print('SIDE EFFECT!')
...
>>> x = MyDict(a=1, b=2)
>>> isinstance(x, dict)
True
>>> x.keys()
SIDE EFFECT!

slide

Q: How do we avoid the side effect?

A: Call dict.keys() directly instead
of letting the x object dispatch x.keys()!
>>> class MyDict(dict):
...     def keys(self):
...         print('SIDE EFFECT!')
...
>>> x = MyDict(a=1, b=2)
>>> dict.keys(x)
['a', 'b']

Safe container methods

list.__len__(x)
list.__getitem__(x, index)
set.__iter__(x)
dict.items(x)
dict.keys(x)
dict.values(x)
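
The same trick works for sequences. In this sketch NoisyList is
a hypothetical subclass whose overrides count their own calls;
going through list.__len__ and list.__getitem__ bypasses them.

```python
class NoisyList(list):
    calls = 0

    def __len__(self):
        NoisyList.calls += 1
        return 0

    def __getitem__(self, index):
        NoisyList.calls += 1
        return None


x = NoisyList(['a', 'b', 'c'])

# Reach past the subclass and use the built-in implementations.
length = list.__len__(x)
items = [list.__getitem__(x, i) for i in range(length)]
```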

slide

These are safe even if
you accidentally try using them
against non-container objects!
>>> set.__iter__(Person())
Traceback (most recent call last):
  ...
TypeError: descriptor '__iter__' requires
 a 'set' object but received a 'Person'

Note that getitem is NOT safe!

dict.__getitem__(x, key)
It can invoke the __hash__() and
__eq__() methods of the keys involved!
In general,
x is y is safe
x == y is not
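
A small experiment shows key code running during a lookup.
NosyKey is a hypothetical class that logs its own method calls;
the exact sequence of calls is a CPython detail, so the sketch
only records that they happen.

```python
class NosyKey(object):
    events = []

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        NosyKey.events.append('hash')
        return hash(self.name)

    def __eq__(self, other):
        NosyKey.events.append('eq')
        return isinstance(other, NosyKey) and self.name == other.name


d = {NosyKey('ahab'): 60}
del NosyKey.events[:]             # forget the calls made while building d

value = dict.__getitem__(d, NosyKey('ahab'))
events = list(NosyKey.events)     # both methods ran during the lookup
```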

Traditional exploration is painful

One mistake at the pdb prompt
and you have to go back to square one

slide

(Pdb) id(a)
(Pdb) type(a)
(Pdb) a.__dict__
(Pdb) id(a.__dict__['firstmate'])
(Pdb) type(a.__dict__['firstmate'])
(Pdb) a.__dict__['firstmate'].__dict__
(Pdb) a.__dict__['firstmate'].name
Aargh!
I was not supposed to say .name
Have to start over again

Solution

The squint package.

Invocation

>>> import squint
>>> s = squint.at(x)

or

def myfunction():
    
    import squint; squint.pdb()
    

slide

Here are two sample objects
on which to try our newfound powers:
>>> ahab = Person()
>>> ahab.name = 'Ahab'
>>> ahab.age = 60
>>> ahab.features = {'religion': 'Puritan'}
>>>
>>> starbuck = Person()
>>> starbuck.name = 'Starbuck'
>>> starbuck.traits = {'loyal'}
>>>
>>> ahab.firstmate = starbuck
>>> starbuck.captain = ahab

Primitives

>>> s = squint.at(ahab)
This s object wraps ahab and promises
to only use “safe” operations
>>> s.id == id(ahab)
True
>>> s.typename
'__main__.Person'
>>> s.type       # dangerous?
<class '__main__.Person'>

Squint's repr() and .verbose

Where all the fun is waiting:

>>> s = squint.at(ahab)
>>> s.verbose
<__main__.Person 0x7f4b9db490d0>
  a_age <int 60>
  a_features <dict 0x7f4b9db490e0>
  a_firstmate <__main__.Person 0x7f4b9db490f0>
  a_name <str 'Ahab'>

slide

All operations on a squinter are safe
squint will never let you
accidentally invoke object code
>>> s.a_age
<int 60>
>>> s.a_name
<str 'Ahab'>

slide

>>> s.a_features.verbose
<dict 0x7f4b9db490e0 len=1>
  k_religion <str 'Puritan'>
>>> s.a_features.k_religion
<str 'Puritan'>
>>> s.a_firstmate.verbose
<__main__.Person 0x7f4b9db490f0>
  a_captain <__main__.Person 0x7f4b9db490d0>
  a_name <str 'Starbuck'>
  a_traits <set 0x7f4b9db49130 len=1>

slide

Q: So what about non-verbose mode?
A: It omits simple sub-objects
since they cannot create cycles

slide

>>> s
<__main__.Person 0x7f4b9db490d0>  int*1  str*4
  a_features <dict 0x7f4b9db490e0 len=1>
  a_firstmate <__main__.Person 0x7f4b9db490f0>
>>> s.a_features
<dict 0x7f4b9db490e0 len=1>  str*7
>>> s.a_firstmate
<__main__.Person 0x7f4b9db490f0>  str*8
  a_captain <__main__.Person 0x7f4b9db490d0>
  a_traits <set 0x7f4b9db49130 len=1>

squint's attribute dialect

slide

>>> squint.at(('zero', 1, 2)).verbose
<tuple 0x7f4b9db491b0 len=3>
  item0 <str 'zero'>
  item1 <int 1>
  item2 <int 2>

slide

>>> squint.at(['foo', 'bar', 'baz']).verbose
<list 0x7f4b9db491c0 len=3>
  item0 <str 'foo'>
  item1 <str 'bar'>
  item2 <str 'baz'>

slide

>>> squint.at({'egg', 'tgz', 'zip'}).verbose
<set 0x7f4b9db491d0 len=3>
  member0 <str 'tgz'>
  member1 <str 'egg'>
  member2 <str 'zip'>

slide

>>> squint.at({
...     'happy_key': 'foo',
...     15: 'bar',
...     'ugly$key': 'baz',
...     }).verbose
...
<dict 0x7f4b9db491e0 len=3>
  k15 <str 'bar'>
  k_happy_key <str 'foo'>
  key0 <str 'ugly$key'>
  value0 <str 'baz'>

slide

Hybrid objects can have
several kinds of squint attribute

slide

>>> class MyDict(dict):
...     def __init__(self):
...         self.myattr = 'myvalue'
...
>>> x = MyDict()
>>> x['happy_key'] = 'foo'
>>> x['ugly$key'] = 'baz'
>>>
>>> squint.at(x).verbose
<__main__.MyDict 0x7f4b9db491f0>
  a_myattr <str 'myvalue'>
  k_happy_key <str 'foo'>
  key0 <str 'ugly$key'>
  value0 <str 'baz'>

slide

Finally, what about getting squint
to do its own search for cycles?

slide

>>> s = squint.at(ahab)
>>> s.cycles()
_ <- .a_firstmate.a_captain
It found the cycle we saw earlier!
┌───────────┐ ┌──────────┐
│  Person   │ │  Person  │
│“firstmate”→ ←“captain” │
└───────────┘ └──────────┘

slide

It will even find cycles
that do not involve the main object
at which you are squinting
>>> melville = Person()
>>> melville.characters = [ahab, starbuck]
>>> squint.at(melville).cycles()
_.a_characters.item0 <- .a_firstmate.a_captain
_.a_characters.item1 <- .a_captain.a_firstmate
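
For a feel of how such a search could work, here is a rough
sketch. It is not squint's implementation: find_cycles is a
hypothetical helper that follows only instance __dict__ entries
(no containers) and reports each attribute path that leads back
to an object already on the path.

```python
def find_cycles(root):
    """Return attribute paths that loop back to an object on the path."""
    cycles = []

    def visit(obj, path, ancestors):
        try:
            attrs = obj.__dict__           # safe by convention
        except AttributeError:
            return                         # ints, strs, ... have no __dict__
        if type(attrs) is not dict:
            return                         # e.g. a class's mappingproxy
        for name in sorted(dict.keys(attrs)):
            child = attrs[name]
            child_path = path + '.' + name
            if id(child) in ancestors:
                cycles.append(child_path)  # we have come back around
            else:
                visit(child, child_path, ancestors + [id(child)])

    visit(root, '_', [id(root)])
    return cycles


class Person(object):
    pass


ahab = Person()
starbuck = Person()
ahab.name = 'Ahab'
ahab.firstmate = starbuck
starbuck.captain = ahab

paths = find_cycles(ahab)   # ['_.firstmate.captain']
```

Comparing by id() keeps the search itself from calling __eq__()
on the objects it visits.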

Conclusion

Thank you!

slide

>>> class Fancy(object):
...     a = 'classattr'
>>> class Fancy2(Fancy):
...     @property
...     def a(self):
...         return 'property'
...     def __getattr__(self, name):
...         if name == 'a':
...             return 'getattr'
...         raise AttributeError
...     def __getattribute__(self, name):
...         if name == '__dict__':
...             ga = object.__getattribute__
...             return ga(self, '__dict__')
...         if name == 'a':
...             return 'getattribute'
...         raise AttributeError

slide

>>> f = Fancy2()
>>> f.__dict__['a'] = 'instanceattr'
>>> f.a
'getattribute'
>>> del Fancy2.__getattribute__
>>> f.a
'property'
>>> del Fancy2.a
>>> f.a
'instanceattr'
>>> del f.__dict__['a']
>>> f.a
'classattr'
>>> del Fancy.a
>>> f.a
'getattr'