Grep and Python

The natural question is: why not just use grep?! But assuming you can’t…

    import re
    import sys

    file = open(sys.argv[2], "r")
    for line in file:
        if re.search(sys.argv[1], line):
            print line,

Things to note: use search instead of match to find the pattern anywhere in the string; the comma (,) after print (Python 2 syntax) suppresses print’s own newline, since line already ends with one; argv includes …
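In Python 3, where print is a function, the same idea might look like the sketch below. The function name grep_lines and the sample lines are made up for illustration and are not from the original answer.

```python
import re


def grep_lines(pattern, lines):
    """Yield each line in which the regex pattern matches anywhere."""
    regex = re.compile(pattern)  # compile once, reuse for every line
    for line in lines:
        # search() scans the whole line; match() would only test the start
        if regex.search(line):
            yield line


# Hypothetical sample input; an open file object can be passed the same way.
sample = ["alpha\n", "beta\n", "alphabet\n"]
for hit in grep_lines(r"pha", sample):
    # end="" plays the role of the trailing comma in Python 2
    print(hit, end="")
```

Because the function accepts any iterable of strings, it can be fed a file opened with open(path) as well as an in-memory list.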

Is there a Java equivalent of Python’s ‘enumerate’ function?

For collections that implement the List interface, you can call the listIterator() method to get a ListIterator. The iterator has (among others) two methods: nextIndex(), to get the index, and next(), to get the value (like other iterators). So a Java equivalent of the Python above might be:

    import java.util.ListIterator;
    import java.util.List;

    List<String> numbers …
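For reference, the Python side of the comparison can be sketched as follows. The list contents and the helper name indexed are invented for illustration. The helper only mimics the nextIndex()/next() pairing described above and is not part of any library.

```python
def indexed(items):
    """Pair each item with its position, like nextIndex() plus next()."""
    index = 0
    for value in iter(items):
        yield index, value
        index += 1


numbers = ["zero", "one", "two"]  # hypothetical data

# Built-in form: enumerate yields (index, value) pairs
for i, name in enumerate(numbers):
    print(i, name)

# The hand-rolled helper produces the same pairs
pairs = list(indexed(numbers))
```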

Is there a memory efficient and fast way to load big JSON files?

There was a duplicate of this question that had a better answer. See https://stackoverflow.com/a/10382359/1623645, which suggests ijson.

Update: I tried it out, and ijson is to JSON what SAX is to XML. For instance, you can do this:

    import ijson

    for prefix, the_type, value in ijson.parse(open(json_file_name)):
        print(prefix, the_type, value)

where prefix is a dot-separated …

Validating a yaml document in python

Given that JSON and YAML are pretty similar beasts, you could make use of JSON-Schema to validate a sizable subset of YAML. Here’s a code snippet (you’ll need PyYAML and jsonschema installed):

    from jsonschema import validate
    import yaml

    schema = """
    type: object
    properties:
      testing:
        type: array
        items:
          enum:
            - this
            - is
            - a …
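A complete minimal sketch of the same approach, assuming PyYAML and jsonschema are installed. The schema, the field names name and tags, and the helper is_valid are hypothetical and not taken from the original answer.

```python
import yaml  # third-party: PyYAML
from jsonschema import ValidationError, validate  # third-party: jsonschema

# Hypothetical schema, written in YAML and loaded into a plain dict
schema = yaml.safe_load("""
type: object
properties:
  name:
    type: string
  tags:
    type: array
    items:
      type: string
required: [name]
""")


def is_valid(yaml_text):
    """Return True if the YAML document satisfies the schema."""
    try:
        # safe_load turns the YAML into dicts/lists that jsonschema can check
        validate(instance=yaml.safe_load(yaml_text), schema=schema)
        return True
    except ValidationError:
        return False


print(is_valid("name: demo\ntags: [a, b]\n"))
print(is_valid("tags: [a, 1]\n"))
```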

What’s the difference between “virtualenv” and “-m venv” in creating virtual environments (Python)

venv is a package shipped directly with Python 3, so you don’t need to pip install anything. virtualenv, by contrast, is an independent library available at https://virtualenv.pypa.io/en/stable/ and can be installed with pip. They solve the same problem and work in a very similar manner. If you use Python 3, I suggest avoiding an “extra” dependency …
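Since venv ships with the standard library, an environment can also be created from Python code with no extra install. The sketch below is roughly what python3 -m venv --without-pip does. The helper name make_env and the directory name demo-env are made up for illustration.

```python
import tempfile
import venv
from pathlib import Path


def make_env(target):
    """Create a virtual environment at target using only the stdlib."""
    # with_pip=False keeps the demo fast and offline; a real project
    # would normally want with_pip=True
    venv.create(target, with_pip=False)
    return Path(target) / "pyvenv.cfg"


with tempfile.TemporaryDirectory() as tmp:
    cfg = make_env(Path(tmp) / "demo-env")
    created = cfg.exists()  # pyvenv.cfg marks the directory as a venv
    print(created)
```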

“:=” syntax and assignment expressions: what and why?

PEP 572 contains many of the details, especially for the first question. I’ll try to concisely summarise/quote what are arguably some of the most important parts of the PEP:

Rationale

Allowing this form of assignment within comprehensions, such as list comprehensions, and lambda functions where traditional assignments are forbidden. This can also facilitate interactive debugging without the …
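Two small illustrations of the operator, as a sketch for Python 3.8 or later. The function names and inputs are invented for this example and do not come from the PEP.

```python
import re


def first_number(text):
    """Return the first integer found in text, or None."""
    # := binds the match object and tests it in a single expression
    if (m := re.search(r"\d+", text)) is not None:
        return int(m.group())
    return None


def squares_above(values, threshold):
    """Return the squares of values that exceed threshold."""
    # Inside a comprehension a plain assignment is not allowed, but :=
    # lets the square be computed once and reused in the output
    return [sq for v in values if (sq := v * v) > threshold]


print(first_number("abc 42 def"))
print(squares_above([1, 2, 3, 4], 5))
```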

Pandas column values to columns? [duplicate]

There are a few ways:

using .pivot_table:

    >>> df.pivot_table(values="val", index=df.index, columns="key", aggfunc="first")
    key      bar     baz      foo
    id
    2    bananas  apples  oranges
    3      kiwis     NaN   grapes

using .pivot:

    >>> df.pivot(index=df.index, columns="key")['val']
    key      bar     baz      foo
    id
    2    bananas  apples  oranges
    3      kiwis     NaN   grapes

using .groupby followed by .unstack:

    >>> df.reset_index().groupby(['id', 'key'])['val'].aggregate('first').unstack()
    key      bar     baz      foo …
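A runnable sketch of the reshaping, assuming pandas is installed. The input frame is hypothetical, reconstructed so that it produces the output shown above. Unlike the original, it keeps id as an ordinary column rather than as the index.

```python
import pandas as pd

# Hypothetical long-format input: one row per (id, key) pair
df = pd.DataFrame(
    {
        "id": [2, 2, 2, 3, 3],
        "key": ["foo", "bar", "baz", "foo", "bar"],
        "val": ["oranges", "bananas", "apples", "grapes", "kiwis"],
    }
)

# pivot: one row per id, one column per distinct key
wide = df.pivot(index="id", columns="key", values="val")

# groupby + unstack: same shape, and it tolerates duplicate (id, key)
# pairs by keeping the first value of each group
wide2 = df.groupby(["id", "key"])["val"].first().unstack()

print(wide)
```

pivot raises an error when an (id, key) pair appears more than once, which is one reason to prefer pivot_table or the groupby form on messy data.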
