Monday 21 December 2015

Running DTruss on OS X El Capitan



Dtruss in OS X, much like strace in Linux, is a very useful tool that shows what a program is doing (more precisely, how it interacts with the operating system and the file system) without having to debug it at the source-code level. In the recent OS X release called El Capitan, however, the tool stopped working with default settings because of the newly introduced System Integrity Protection (SIP) mechanism. Trying to run dtruss now results in an error:
$ dtruss ls
dtrace: failed to initialize dtrace: DTrace requires additional privileges
$ sudo dtruss ls
dtrace: failed to execute ls: dtrace cannot control executables signed with restricted entitlement

Some solutions on the Internet suggest fully or partially disabling System Integrity Protection to allow dtruss to do its job. However, such drastic measures are not really necessary in most cases: all that is really needed is for the program you're running to carry a signature with no "restricted entitlement", i.e. one that marks it as trusted and allowed to be "spied" on.

This short step-by-step guide shows how to run dtruss without meddling with SIP (and, thus, potentially compromising the computer's security).
We assume that we want to run dtruss on the standard command line utility ls. To run dtruss on any other application, just replace /bin/ls in the commands below with the path to your application.

1) Create a copy of the standard sudo utility in a non-protected directory and re-sign it with an ad-hoc signature. This is needed so that we can run our application with standard user privileges (dtruss itself requires administrator privileges).
$ sudo mkdir -p /usr/local/bin # the directory may already exist on your system
$ sudo cp /usr/bin/sudo /usr/local/bin/sudo
$ sudo codesign -fs- /usr/local/bin/sudo

2) Now, re-sign the application you want to trace with an ad-hoc signature. Note: it will not work if the application is located in one of the standard binary directories which are protected (/System, /bin, /sbin, /usr/bin). In that case you need to copy the application somewhere else.

$ cp /bin/ls /usr/local/bin # this is required only if the binary is located in a protected directory, or if you want to preserve the signature on its original copy
$ sudo codesign -fs- /usr/local/bin/ls

3) Finally, you're good to run dtruss!
$ sudo dtruss -f /usr/local/bin/sudo -u `whoami` /usr/local/bin/ls
...
43024/0x14e0f8:  fchdir(0x4, 0x7FE2DC002600, 0x1000)         = 0 0
43024/0x14e0f8:  close_nocancel(0x4)         = 0 0
43024/0x14e0f8:  fstat64(0x1, 0x7FFF5DB92D88, 0x1000)         = 0 0
43024/0x14e0f8:  ioctl(0x1, 0x4004667A, 0x7FFF5DB92DCC)         = 0 0
43024/0x14e0f8:  write_nocancel(0x1, "AUTHORS\t\tChangeLog-2014\tREADME-alpha\tbootstrap.conf\tdoc\t\tm4\t\tthanks-gen\n@\004\b\0", 0x48)         = 72 0
43024/0x14e0f8:  write_nocancel(0x1, "BUGS\t\tMakefile.am\tTHANKS.in\tbuild-aux\tgnulib\t\tpo\n\004\b\0", 0x31)         = 49 0
43024/0x14e0f8:  write_nocancel(0x1, "COPYING\t\tNEWS\t\tbasicdefs.h\tcfg.mk\t\tgnulib-tests\tsed\n@\004\b\0", 0x34)         = 52 0
43024/0x14e0f8:  write_nocancel(0x1, "COPYING.DOC\tREADME\t\tbootstrap\tconfigure.ac\tlib\t\ttestsuite\n\b\0", 0x3A)         = 58 0
...
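
For convenience, the three steps above can be collected into a small helper. This is only a sketch: it builds and prints the same commands as in the guide and does not execute anything. The function name dtruss_commands and its parameters are my own invention, and the target is copied to an explicit destination path instead of just the directory.

```python
import getpass
import os
import shlex


def dtruss_commands(target, user=None, workdir="/usr/local/bin"):
    """Return the commands from steps 1-3 as argv lists (nothing is executed)."""
    user = user or getpass.getuser()
    sudo_copy = os.path.join(workdir, "sudo")
    target_copy = os.path.join(workdir, os.path.basename(target))
    return [
        # Step 1: copy sudo to a non-protected directory and re-sign it ad hoc.
        ["sudo", "mkdir", "-p", workdir],
        ["sudo", "cp", "/usr/bin/sudo", sudo_copy],
        ["sudo", "codesign", "-fs-", sudo_copy],
        # Step 2: copy the traced program and re-sign it ad hoc.
        ["cp", target, target_copy],
        ["sudo", "codesign", "-fs-", target_copy],
        # Step 3: run dtruss as root, and the program itself as the normal user.
        ["sudo", "dtruss", "-f", sudo_copy, "-u", user, target_copy],
    ]


if __name__ == "__main__":
    for argv in dtruss_commands("/bin/ls"):
        print(shlex.join(argv))
```

Printing instead of running keeps the helper harmless; paste the output into a terminal once you have checked that the paths are what you expect.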

Used sources:
http://stackoverflow.com/a/33776939/161934
https://sourceware.org/gdb/wiki/BuildingOnDarwin

P.S. With this approach the first few system calls will belong to sudo and not your app. Just skip them. Or try suspending the process at its start and attaching to it with dtruss (I'm still figuring out the best way to do that myself without meddling with the dynamic linker...).
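
One way to "just skip them" is to filter the trace by process ID. The sketch below parses lines in the format shown in step 3 and keeps only those of one PID. It assumes that your program ends up with a different PID than sudo (check this in your own trace) and that you pass that PID in yourself; the names parse_line and calls_for_pid are hypothetical helpers, not part of dtruss.

```python
import re

# Matches lines such as:
#   43024/0x14e0f8:  close_nocancel(0x4)         = 0 0
LINE_RE = re.compile(
    r"^\s*(?P<pid>\d+)/(?P<tid>0x[0-9a-fA-F]+):\s+"
    r"(?P<call>\w+)\((?P<args>.*)\)\s+=\s+(?P<ret>\S+)\s+(?P<errno>\S+)\s*$"
)


def parse_line(line):
    """Split one dtruss line into its parts; return None for anything else."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["pid"] = int(fields["pid"])
    return fields


def calls_for_pid(lines, pid):
    """Keep only the system calls made by the given process ID."""
    parsed = (parse_line(line) for line in lines)
    return [call for call in parsed if call is not None and call["pid"] == pid]


if __name__ == "__main__":
    sample = [
        # Hypothetical line standing in for a call made by sudo itself.
        "43023/0x14e0f1:  getuid(0x0, 0x0, 0x0)\t\t = 501 0",
        # Two lines taken from the trace shown in step 3.
        "43024/0x14e0f8:  fchdir(0x4, 0x7FE2DC002600, 0x1000)         = 0 0",
        "43024/0x14e0f8:  close_nocancel(0x4)         = 0 0",
    ]
    for call in calls_for_pid(sample, 43024):
        print(call["call"], call["args"], "->", call["ret"])
```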
P.P.S. As you can probably tell, this post was written by someone not very familiar with OS X. I'll greatly appreciate any comments on potential mistakes and misconceptions in it.
P.P.P.S. In this article we signed the application with an ad-hoc signature. If you want to create your own signature and sign the application with it, do the following:
  • Run the Keychain Access application (⌘+Space => "Keychain Access" => Enter)
  • Pick "Keychain Access" => "Certificate Assistant" => "Create a Certificate"
  • Set the type of the certificate to "Code Signing" and give it an arbitrary name "your_certificate_name" which you will need to re-enter later. Your username should do.
  • Create the certificate. 
  • Then, re-run codesign as indicated above, replacing the final "-" in "-fs-" with the name of your certificate.

Monday 4 May 2015

How much memory will it take to store a single character string in Python 3?

Been investigating an unexpected bloat in one of my Python programs and stumbled upon this fairly surprising result. As I was storing a lot of short strings, I decided to see how much space a one character long string ('x') takes. As it turns out, a whole friggin' lot! Below I'm providing the structure used to store a string object in CPython, with all the macros expanded. Every string object gets a separate copy of this structure. See unicodeobject.h and object.h for details.

The data below pertains to the 64-bit version of the interpreter; on the 32-bit version the total size should be smaller.

typedef struct {
    /* _PyObject_HEAD_EXTRA */
    /* me: These two pointers exist only in debug builds compiled with
       Py_TRACE_REFS. In a regular build the macro expands to nothing,
       so they are not counted in the total below. */
    struct _object *_ob_next;
    struct _object *_ob_prev;
    /* PyObject_HEAD */
    /* me: Py_ssize_t is 8 bytes long. */
    Py_ssize_t ob_refcnt;
    struct _typeobject *ob_type;
    Py_ssize_t length;          /* Number of code points in the string */
    /* me: Py_hash_t is 8 bytes long. */
    Py_hash_t hash;             /* Hash value; -1 if not set */
    struct {
        unsigned int interned:2;
        unsigned int kind:3;
        unsigned int compact:1;
        unsigned int ascii:1;
        unsigned int ready:1;
        unsigned int :24;
    } state;
    wchar_t *wstr;              /* wchar_t representation (null-terminated) */
} PyASCIIObject;

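Total = sizeof(PyASCIIObject) + strlen("x") + 1 (the terminating null byte) = 48 + 2 = 50. The 48 bytes are four 8-byte fields (ob_refcnt, ob_type, length, hash), the 4-byte state plus 4 bytes of padding, and the 8-byte wstr pointer. The same number can be obtained from within Python by using the standard sys.getsizeof() function:
>>> import sys
>>> sys.getsizeof('x')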
50

50 bytes to store a single character string. Yikes!
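
To double-check the arithmetic without a C compiler, the layout can be mirrored with ctypes. This is a sketch under a few assumptions: the class name PyASCIIObjectLayout is mine, the bit fields are modelled as a single unsigned int (they add up to 32 bits), and the debug-only _ob_next/_ob_prev pointers are left out. Newer CPython releases have changed this struct, so the header size computed here need not match what your interpreter reports.

```python
import ctypes
import sys


class PyASCIIObjectLayout(ctypes.Structure):
    """Mirror of the struct above for a regular (non-debug) build."""
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),
        ("ob_type", ctypes.c_void_p),
        ("length", ctypes.c_ssize_t),
        ("hash", ctypes.c_ssize_t),   # Py_hash_t is a typedef of Py_ssize_t
        ("state", ctypes.c_uint),     # 2+3+1+1+1+24 bits = one 32-bit word
        ("wstr", ctypes.c_void_p),
    ]


if __name__ == "__main__":
    header = ctypes.sizeof(PyASCIIObjectLayout)
    print("header size according to ctypes:", header)
    print("header + 'x' + null terminator: ", header + 2)
    print("sys.getsizeof('x'):             ", sys.getsizeof("x"))
    # Each additional ASCII character costs one more byte.
    print("cost of one extra character:    ", sys.getsizeof("xx") - sys.getsizeof("x"))
```

The fixed header dominates: only the last line reflects the actual character data, which is why lots of short strings are so expensive.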

P.S. One seemingly obvious optimization would be to lay objects in memory in a way that logically adjacent objects are close to each other in memory, so instead of using 8-byte pointers 2/4-byte offsets would suffice. (I'm a newbie though, so I have absolutely no clue how feasible that would be in CPython.)

Friday 6 February 2015

Making custom versions of packages in Ubuntu/Debian

Had to make some changes in the source code of a couple of packages on my system and decided to document the process for future reference. Surprisingly, it's way easier than I thought it would be.

As an example, suppose we want to make the default graphical viewer Eye of Gnome not jump back to the first picture after the last one in a directory (and, similarly, not jump in the opposite direction, either).

First, we need to get the source code for the program:
$ mkdir eog; cd eog; apt-get source eog

Then let's install all the needed dependencies to compile the package:
$ sudo apt-get build-dep eog

All right, time to make the changes we need. In this case, we need to slightly modify the corresponding part in eog_thumb_view_select_single() in src/eog-thumb-view.c to look like this:

        case EOG_THUMB_VIEW_SELECT_LEFT:
            gtk_tree_path_prev (path);
            break;
        case EOG_THUMB_VIEW_SELECT_RIGHT:
            if (gtk_tree_path_get_indices (path) [0] != n_items - 1)
                gtk_tree_path_next (path);
            break;

Now that we've made the changes, let's build our custom package (run this from inside the unpacked source directory, eog-<version>):
$ dpkg-buildpackage -b

The program has been compiled and assembled in one neat package. All that is left is to install it, replacing the standard version (debi comes from the devscripts package):
$ sudo debi

Hooray, we have our custom version of Eye of Gnome installed and operational!

Now if I only knew a way to apply patches and rebuild the packages automatically every time they get an update...
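
One possible starting point is to script the manual steps. The sketch below assumes the change has been saved as a patch file in -p1 format, and that apt-get source unpacks into a directory named <package>-<version>. The names rebuild_steps and run_steps are hypothetical, and nothing here detects updates by itself; you would still have to trigger it after each upgrade.

```python
import glob
import os
import subprocess


def rebuild_steps(package, patch_path):
    """Return (argv, run_in_source_dir) pairs mirroring the manual steps."""
    return [
        (["apt-get", "source", package], False),
        (["sudo", "apt-get", "build-dep", package], False),
        (["patch", "-p1", "-i", os.path.abspath(patch_path)], True),
        (["dpkg-buildpackage", "-b"], True),
        (["sudo", "debi"], True),
    ]


def run_steps(package, patch_path, workdir):
    """Execute the steps inside workdir, stopping at the first failure."""
    os.makedirs(workdir, exist_ok=True)
    for argv, in_source_dir in rebuild_steps(package, patch_path):
        cwd = workdir
        if in_source_dir:
            # Assumption: workdir holds exactly one unpacked source tree.
            pattern = os.path.join(workdir, package + "-*")
            cwd = [d for d in glob.glob(pattern) if os.path.isdir(d)][0]
        subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    # Only print the plan; call run_steps() yourself to actually execute it.
    for argv, in_source_dir in rebuild_steps("eog", "eog.patch"):
        where = "source dir" if in_source_dir else "work dir"
        print(f"[{where}]", " ".join(argv))
```

Using an empty work directory for every run keeps the source-directory lookup unambiguous.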