diff --git a/_layouts/post.html b/_layouts/post.html index 3749bb1..117975a 100644 --- a/_layouts/post.html +++ b/_layouts/post.html @@ -7,7 +7,7 @@ layout: base
- Date: {{ page.date | date: "%b %d, %Y" }} + Date: {{ page.date | date: "%a, %b %d, %Y" }}
{{ content }} diff --git a/_posts/2012-01-29-earwigbot-and-toolserver-updates.md b/_posts/2012-01-29-earwigbot-and-toolserver-updates.md new file mode 100644 index 0000000..bd20641 --- /dev/null +++ b/_posts/2012-01-29-earwigbot-and-toolserver-updates.md @@ -0,0 +1,90 @@ +--- +layout: post +title: EarwigBot and Toolserver Updates +description: More progress on EarwigBot and the Toolserver site rewrite, including dynamic backgrounds. +--- + +Haven't really said much in a while, so I felt it appropriate to make a new +blog post. No, I'm not dead, and yes, I _am_ busy working on my Wikipedia +responsibilities, including EarwigBot and my Toolserver site (which I'll go +into detail about in a bit). Progress is not as quick now as it was over the +summer, and you can blame school for the delay in getting things done. My +primary to-do list for Wikipedia right now looks like this: + +1. Finish copyvio detection in the new EarwigBot +2. Integrate EarwigBot's copyvio detection with the new Toolserver site + +\#1 has a good portion of its work done, but I still have to finish the actual +detection. I've isolated the work down to a few methods of +`earwigbot.wiki.copyright.CopyrightMixin`: `_copyvio_strip_html()`, to +extract the "text" (i.e. content inside <p> tags) from an HTML document +(I'll probably use something like +[Beautiful Soup](http://www.crummy.com/software/BeautifulSoup/) for this); +`_copyvio_strip_article()`, to extract the "text" from an article (that is, +stripping templates, quotes, references); and `_copyvio_chunk_article()`, to +divide a stripped article into a list of web-searchable queries. Everything +else, including the Task class for `afc_copyvios` is done. + +\#2 is very simple once #1 is done. I've already written the code to load +EarwigBot's wiki toolset from `copyvios.mako`, and the config file is written, +so running the detector is trivial once it works. The only thing left here is +to have the tool produce relatively eye-pleasing output, perhaps with a +"details" section showing the Markov chains formed from the two sources and +comparing them visually. Not necessary at all, but a nice touch. + +Unfortunately, there's still a bit more work to do on EarwigBot before he's +ready for his first release (0.1!). Aside from the copyvio stuff above, which +is integrated directly as a function of `Page`, I want to finish porting over +the remaining tasks from old EarwigBot that are still running via cron, improve +the Wiki Toolset such that new sites can be added programmatically, and improve +config such that it can be created by the bot and not only by hand. This is the +main barrier stopping other people from running EarwigBot, and thus the primary +concern before v0.1 is good. Of course, none of this urgent; getting copyvio +detection finished is my primary concern. + +## Dynamic Backgrounds + +Now that that's covered, let's look at something (mostly unrelated) I finished +a couple days ago: dynamic backgrounds for +[my new toolserver site](http://toolserver.org/~earwig/rewrite)! You can see it +in action a bit better on [this page](http://toolserver.org/~earwig/earwigbot). +The background is the the [Wikimedia Commons](http://commons.wikimedia.org/) +[Picture of the Day](http://commons.wikimedia.org/wiki/Commons:Picture_of_the_day), +loaded and displayed with JavaScript. +[Here's the code for it](https://github.com/earwig/toolserver/blob/master/static/js/potd.js), +a good deal more code than I had expected to write. + +Here's what we have to do: + +1. Query Commons' API for the content of the template \{\{Potd/YYY-MM-DD}}. +2. Parse that content for the filename of the image, which will be hidden in + something like \{\{Potd filename|1=Foo.png}}. +3. Query Commons' API again for Foo.png's URL and dimensions. +4. Since we want the image to "cover" the background (that is, be the smallest + size possible while leaving none of the background color visible), we need + to calculate the image's aspect ratio and our own aspect ratio, then + determine the width of the thumbnail we want. If the image is shorter than + our screen, the necessary width is our screen's width, but if the image is + longer than our screen, the necessary width is our screen's height + multiplied by the image's aspect ratio. +5. If the width of our desired thumbnail is less than the width of the image, + we'll alter the image's URL (insert a `/thumb/` in the middle somewhere and + add a resolution at the end) and set that as our `body`'s `background-image` + property. This is better than loading the full image and downscaling it, + because less bandwidth is required. +6. If the width of our desired thumbnail is _greater_ than (or equal to) the + width of the image, we load the full image and upscale it (gasp! horror!) + using the CSS bit `background-size: cover;`. + +All of the images I tested looked decent when displayed under this method, some +better than others, but all acceptable. I figured this code provided a nice +touch to an otherwise drab webpage (like the one you're viewing now, it +wouldn't have been very pretty), which is why I did it, but I couldn't help but +wonder if there was an... easier... method that still saved bandwidth and +didn't resort to ugly scaling/cropping/repeating/whatever, but I could come up +with nothing. It was a fun project in a language I almost never use, though, so +worth it in the end. + +That's all for now! + +:—earwig diff --git a/css/main.css b/css/main.css index 7246657..45f0850 100644 --- a/css/main.css +++ b/css/main.css @@ -28,9 +28,6 @@ body { .index-header { padding-top: 15px; -} - -h1, h2 { text-align: center; } @@ -38,6 +35,11 @@ pre { white-space: pre-wrap; } +code { + background: #f2f2f2; + border: 1px solid #e8e8e8; +} + div.project { border: 1px solid #CCC; border-radius: 5px; @@ -105,6 +107,7 @@ h1#head { padding-top: 0px; margin-bottom: 0px; padding-bottom: 0px; + text-align: center; } div#container {