Conversation
About 2x faster at reading stat files and can parse stat files based on a list of score criteria. Support for statv1
benmwebb
left a comment
There was a problem hiding this comment.
Looks generally OK, and seems like the tests all pass now, so I'll be happy to merge this once you've made the changes I suggested. Oh, and run the file through tools/dev_tools/python_tools/reindent.py or autopep8. The indentation looks a little wonky to me in places.
|
|
||
| #Store these keys in a dictionary. Example pair: {109 :'Total_Score'} | ||
| self.dict = self.parse_line(line, header=True) | ||
| #self.dict = ast.literal_eval(line) |
There was a problem hiding this comment.
Remove, not comment out, old stuff. If we want to go back, that's what git is for.
| for k in kkeys: | ||
| self.inv_dict.update({self.dict[k]: k}) | ||
| else: | ||
| print("WARNING: statfile v1 is deprecated. Please convert to statfile v2") |
There was a problem hiding this comment.
Nothing wrong with your patch here, but when you update it, you should use the "proper" deprecation function, as in current develop.
|
|
||
|
|
||
| def parse_line(self, line, header=False): | ||
| # Parses a line and returns a dictionary of key:value pairs |
There was a problem hiding this comment.
Use a docstring ("foo") rather than comment (#foo) here so it gets picked up by doxygen. (If you don't want it to be part of the public interface - which requires documentation - make it a private function by calling it _parse_line rather than parse_line.)
| fd = split[i] | ||
| fields = fd.split(",") # split via commas to get key:value pair | ||
| for h in fields[0:-1]: | ||
| if h != "": # For some reason, there is occasionally an empty field. Ignoring these seems to work. |
There was a problem hiding this comment.
Don't duplicate all this code from above. Put it in a function.
| def get_keys(self): | ||
| """ Returns a list of the string keys that are included in this dictionary | ||
| """ | ||
| self.klist = [k[1] |
There was a problem hiding this comment.
Do this only once, not each time the function is called. Easiest would just be to setup klist when the file is first parsed.
| # otherwise, just return the string | ||
| try: | ||
| float(c) | ||
| except: |
|
|
||
|
|
||
| def does_line_pass_criteria(self, fields, c): | ||
| # Given a stat file line (as a dictionary) and a criteria tuple, decide whether |
|
|
||
|
|
||
| comparison = c[2] | ||
| if comparison not in ["==", "<", ">"]: |
There was a problem hiding this comment.
Generally considered better style to use () rather than [] here, since the set of items is immutable.
|
|
||
| comparison = c[2] | ||
| if comparison not in ["==", "<", ">"]: | ||
| raise Exception('ERROR: IMP.pmi.output.ProcessOutput.does_line_pass_criteria() - Comparison string must be \'>\', \'<\' or \'==\', instead of %s' % (comparison)) |
There was a problem hiding this comment.
Raise a more specific exception. ValueError is probably the most appropriate here.
| model_value = self._float_string(fields[intkey]) | ||
|
|
||
| if type(value) is not type(model_value): | ||
| raise Exception('ERROR: IMP.pmi.output.ProcessOutput.does_line_pass_criteria() - Comparison field %s is of type %s while you tried to compare it to a %s' % (key, type(model_value), type(value))) |
About 2x faster at reading stat files and can parse stat files based on a list of score criteria.
Support for statv1