Parse.java:
Added comments
JavaParser.java:
Updated the genSymbols method and a private class 'NodeVisitor' which
implements ASTVisitor. genSymbols returns an instance of the
Symbols class containing all relevant data about the Java code.
JavaSymbols.java:
Add fields which map class, interface, method, field, and
variable names to positions.
Add:
c.py
- The CTreeCutter class is very similar to PyTreeCutter, except that
it uses self.cache, which PyTreeCutter does not do yet.
- CTreeCutter visit functions simply add start and end lines of
the node to the cache, and visit_Decl pushes the cache onto
accum.
- parse_c performs the same task as parse_py. However, many
C files need to be pre-processed before they are parsed.
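The cache-then-flush pattern described above can be sketched roughly as
follows. This is only an illustration: it uses Python's standard `ast`
module as a stand-in for the C parser, the class name `TreeCutterSketch`
is hypothetical, and `visit_FunctionDef` plays the role that visit_Decl
plays in CTreeCutter.

```python
import ast


class TreeCutterSketch(ast.NodeVisitor):
    """Hypothetical stand-in for CTreeCutter, built on the stdlib ast module."""

    def __init__(self):
        self.cache = []   # (start_line, end_line) pairs for the current declaration
        self.accum = []   # one flushed cache per declaration

    def generic_visit(self, node):
        # Record the node's line range if it has one, then descend.
        if hasattr(node, "lineno"):
            end = getattr(node, "end_lineno", None) or node.lineno
            self.cache.append((node.lineno, end))
        super().generic_visit(node)

    def visit_FunctionDef(self, node):
        # Analogous to visit_Decl: collect the subtree, then push the
        # cache onto accum and start a fresh one.
        self.generic_visit(node)
        self.accum.append(self.cache)
        self.cache = []


source = "def f():\n    x = 1\n    return x\n"
cutter = TreeCutterSketch()
cutter.visit(ast.parse(source))
```

After the visit, `cutter.accum` holds one list of line ranges per function
definition in `source`.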
Add:
bitshift/crawler/crawler.py
-Add a more efficient method of querying GitHub's API for stargazer
counts, batching 25 repositories per request.
-Add watcher counts for Bitbucket repositories, by querying the
Bitbucket API once per repository (inefficient, but the API in question
isn't sufficiently robust to accommodate a better approach, and Git
repositories surface so infrequently that there shouldn't be any query
limit problems).
Several of the closed issues were partly addressed in previous commits;
this update, the final one to the crawler package for the moment,
definitively closes them.
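A minimal sketch of the batching idea, assuming repository names are
collected as "owner/name" strings. The helper names `batch()` and
`build_search_query()` are hypothetical, and combining names with the
GitHub search API's `repo:` qualifier is an assumption about how one
request could cover 25 repositories, not a description of the crawler's
actual code.

```python
def batch(items, size=25):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_search_query(repo_names):
    """Join "owner/name" strings into one query using `repo:` qualifiers."""
    return " ".join("repo:%s" % name for name in repo_names)


# Hypothetical input: 60 repository names become three requests.
names = ["owner/repo%d" % i for i in range(60)]
queries = [build_search_query(group) for group in batch(names)]
```

Each entry in `queries` would be sent as the `q` parameter of a single
search request, so 60 repositories cost three requests instead of 60.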
Ref:
bitshift/crawler/indexer.py
-move all `GitIndexer`-specific functions (e.g., `_decode()`,
`_is_ascii()`) from the global scope to the class definition.
Add:
bitshift/
__init__.py
-add `_configure_logging()`, which sets up a more robust logging
infrastructure than was previously used: log files are rotated once
per hour, and have some additional formatting rules.
(crawler, indexer).py
-add hierarchically-descending loggers to individual threaded
classes (`GitHubCrawler`, `GitIndexer`, etc.); add logging calls.
indexer.py
-remove file filtering regex matches from `_get_tracked_files()`,
as non-code files will be discarded by the parsers.
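The logging setup described above might look roughly like this sketch.
It uses the standard library's `TimedRotatingFileHandler` for hourly
rotation; the function name `configure_logging()`, the file name
`app.log`, the format string, the backup count, and the class
`GitIndexerSketch` are all assumptions for illustration, not the
project's actual values.

```python
import logging
import logging.handlers
import os
import tempfile


def configure_logging(log_dir, filename="app.log"):
    """Attach an hourly-rotating file handler to the root logger."""
    handler = logging.handlers.TimedRotatingFileHandler(
        os.path.join(log_dir, filename), when="H", interval=1,
        backupCount=20)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(levelname)-8s %(name)s: %(message)s"))
    root = logging.getLogger()
    root.addHandler(handler)
    root.setLevel(logging.DEBUG)
    return handler


class GitIndexerSketch(object):
    """Hypothetical threaded class that owns a child logger."""

    def __init__(self, parent="indexer"):
        # A dotted name such as "indexer.GitIndexerSketch" makes this
        # logger a descendant of "indexer", so its records propagate up
        # to the root logger's handlers.
        self._logger = logging.getLogger(
            "%s.%s" % (parent, self.__class__.__name__))
        self._logger.info("Starting.")


log_dir = tempfile.mkdtemp()
handler = configure_logging(log_dir)
indexer = GitIndexerSketch()
```

Because the loggers are named hierarchically, every class logs through
the one handler configured at the root, and each record still carries
the name of the class that emitted it.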
Add:
bitshift/crawler/crawler.py
-add `_get_repo_stars()` to `GitHubCrawler`, which queries the GitHub
API for the number of stars that a given repository has.
-log the `next_api_url` every time it's generated by `GitHubCrawler` and
`BitbucketCrawler` to their respective log files.
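A per-repository star lookup along the lines of `_get_repo_stars()`
could be sketched as below. It relies on the `stargazers_count` field
that GitHub's repository endpoint returns; the function names, the
injectable `opener` parameter, and the canned response are assumptions
made so the sketch can run without network access.

```python
import io
import json
import urllib.request

API_URL = "https://api.github.com/repos/%s"


def parse_star_count(payload):
    """Return the star count from decoded repository JSON, or 0 if absent."""
    return int(payload.get("stargazers_count", 0))


def get_repo_stars(full_name, opener=urllib.request.urlopen):
    """Fetch "owner/name" from the API and return its star count."""
    with opener(API_URL % full_name) as response:
        return parse_star_count(json.loads(response.read().decode("utf-8")))


def fake_opener(url):
    # Stand-in for urlopen() that returns a canned response body.
    return io.BytesIO(b'{"full_name": "owner/repo", "stargazers_count": 42}')


stars = get_repo_stars("owner/repo", opener=fake_opener)
```

Passing the real `urllib.request.urlopen` (the default) would issue one
request per repository, which is the cost the later batching change
avoids.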