String substitution in Python

I recently had a set of SPSS syntax that iterated over multiple variables and generated many similar graphs. They were time series graphs of multiple variables, so the time axis stayed the same but the variable on the Y axis changed, and the code also changed the titles of the graphs. Initially I was using the % string substitution with a (long) list of replacements. Here is a brief analogous example.

*Creating some fake data.
MATRIX.
SAVE {UNIFORM(100,6)} /OUTFILE = * /VARIABLES = V1 TO V3 X1 TO X3.
END MATRIX.
DATASET NAME x.

*Crazy long string substitution.
BEGIN PROGRAM Python.
import spss

#my variable and label lists
var = ["V1","V2","V3"]
lab = ["Var 1","Var 2","Var 3"]

for v,l in zip(var,lab):
  spss.Submit("""
*Descriptive statistics.
FREQ %s.
CORRELATIONS /VARIABLES=X1 X2 X3 %s.
*Graph 1.
GRAPH /SCATTERPLOT(BIVAR)=X1 WITH %s /TITLE = "%s".
*Graph 2.
GRAPH /SCATTERPLOT(BIVAR)=X2 WITH %s /TITLE = "%s".
*Graph 3.
GRAPH /SCATTERPLOT(BIVAR)=X3 WITH %s /TITLE = "%s".
""" % (v,v,v,l,v,l,v,l))
END PROGRAM.

When you only have to substitute one or two things, "str %s and %s" % (one,two) is no big deal, but here it is quite annoying to keep track of the position of every separate argument as the list grows. Also, we are really just recycling the same two objects over and over. I thought this is Python, so there must be an easier way, and sure enough there is! A simple alternative is to use the format method of a string object. Format takes any number of arguments, so the prior example would be "str {0} and {1}".format(one,two). Instead of %s, you place curly braces around the index position of the argument (Python has zero-based indices, so the first argument is always 0), and the same index can be repeated as many times as needed.
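As a minimal sketch outside of SPSS (the values of `one` and `two` and the shortened syntax string are placeholders made up for illustration), the two styles build the same string, but format only needs each argument once:

```python
# Plain Python, no SPSS needed; `one` and `two` are placeholder values.
one, two = "V1", "Var 1"

# % substitution: one argument per %s, supplied in order.
old = "FREQ %s. TITLE '%s'. CORR %s." % (one, two, one)

# format: each argument is passed once and referenced by its index.
new = "FREQ {0}. TITLE '{1}'. CORR {0}.".format(one, two)

print(old == new)
```

The % version has to repeat `one` in the argument tuple, which is exactly the bookkeeping that grows out of hand in the long example above.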

Here is the SPSS syntax updated to use format for string substitution.

*This is much simpler using ".format" for substitution.
BEGIN PROGRAM Python.
import spss

var = ["V1","V2","V3"]
lab = ["Var 1","Var 2","Var 3"]

for v,l in zip(var,lab):
  spss.Submit("""
*Descriptive statistics.
FREQ {0}.
CORRELATIONS /VARIABLES=X1 X2 X3 {0}.
*Graph 1.
GRAPH /SCATTERPLOT(BIVAR)=X1 WITH {0} /TITLE = "{1}".
*Graph 2.
GRAPH /SCATTERPLOT(BIVAR)=X2 WITH {0} /TITLE = "{1}".
*Graph 3.
GRAPH /SCATTERPLOT(BIVAR)=X3 WITH {0} /TITLE = "{1}".
""".format(v,l))    
END PROGRAM.

Much simpler. You can use a dictionary with the % substitution to the same effect, but here the format method is quite a simple solution. Another option I might explore more in the future is string templates, which seem like a good candidate for long strings of SPSS code.
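For reference, here is a minimal sketch of those two alternatives in plain Python. The single GRAPH line and the key names `var` and `lab` are just illustrative choices, not part of the syntax above.

```python
from string import Template

# Illustrative values; `v` and `l` mirror the loop variables used above.
v, l = "V1", "Var 1"

# % substitution with a dictionary: placeholders are named, not positional.
by_dict = 'GRAPH /SCATTERPLOT(BIVAR)=X1 WITH %(var)s /TITLE = "%(lab)s".' % {"var": v, "lab": l}

# string.Template: $name placeholders are filled in by substitute().
tmpl = Template('GRAPH /SCATTERPLOT(BIVAR)=X1 WITH $var /TITLE = "$lab".')
by_template = tmpl.substitute(var=v, lab=l)

print(by_dict)
print(by_template)
```

Both approaches name the placeholder in the string itself, so the order of the arguments no longer matters.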

4 Comments

  1. I much prefer to use the named substitution method, e.g.
    """text %(abc)s and %(def)s""" % locals()
    While the format mechanism can be better than simple %s, using named substitutions generally gives more readable code IMO.

    -Jon Peck (WP seems to have picked my WP account here, which wasn’t what I intended.)

    • That is fair (Jignesh apparently shares the same opinion). When staring at a hundred lines of GGRAPH code I like “{?}” because I think it stands out a bit more than “%()s”, and spotting the placeholders is a bigger problem for me than seeing exactly what is substituted, but it is not a big difference.

  2. Jignesh Sutar / February 20, 2015

    I tend to use the locals() variation of this, as it makes it clear what is being substituted into the body of the string. The “s” after the closing parenthesis (in the code below) indicates that the variable should be formatted as a string. An alternative could be “03d”, which produces an integer left-padded with zeros to three digits, and other such Python formatting codes are available as well.

    *Using locals() method.
    BEGIN PROGRAM Python.
    import spss

    var = ["V1","V2","V3"]
    lab = ["Var 1","Var 2","Var 3"]

    #locals() supplies v and l to the named placeholders
    for v,l in zip(var,lab):
      spss.Submit("""
    *Descriptive statistics.
    FREQ %(v)s.
    CORRELATIONS /VARIABLES=X1 X2 X3 %(v)s.
    *Graph 1.
    GRAPH /SCATTERPLOT(BIVAR)=X1 WITH %(v)s /TITLE = "%(l)s".
    *Graph 2.
    GRAPH /SCATTERPLOT(BIVAR)=X2 WITH %(v)s /TITLE = "%(l)s".
    *Graph 3.
    GRAPH /SCATTERPLOT(BIVAR)=X3 WITH %(v)s /TITLE = "%(l)s".
    """ % locals())
    END PROGRAM.

  1. String substitution in Python continued | Andrew Wheeler
