Open Wiki Blog Planet

30 July, 2010

Gerard Meijssen

#Unicode on sorting French topic lists

A list extracted from the French #Wikipedia is used for "Unicode Technical Note #34". This document presents a case study of collation issues, using data from a French language topic list to illustrate alternative orders and how to obtain them. It also discusses implementation issues for ordering lists of this type.
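
As a rough illustration of what "alternative orders" can mean for a French list, here is a minimal Python sketch. It is not taken from the technical note: the word list and the helper names `strip_accents` and `french_sort_key` are invented for illustration, and the key only approximates a two-level comparison (base letters first, accents as a tie-breaker) rather than implementing the full Unicode Collation Algorithm.

```python
import unicodedata

def strip_accents(text):
    """Return text with combining marks removed, e.g. 'école' -> 'ecole'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def french_sort_key(text):
    """Hypothetical two-level key: base letters first, spelling as tie-breaker."""
    return (strip_accents(text).casefold(), text)

# Invented sample list, normalized to precomposed (NFC) form.
topics = [unicodedata.normalize("NFC", t)
          for t in ["zèbre", "école", "cote", "éclair"]]

# Plain code point order puts every word starting with 'é' after 'z'.
by_code_point = sorted(topics)

# The two-level key files 'é' words together with 'e' words.
by_base_letter = sorted(topics, key=french_sort_key)

print(by_code_point)
print(by_base_letter)
```

A production system would normally rely on a real collation library with a French tailoring instead of a hand-written key like this one.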

Even though the document is quite interesting, I would have been thrilled if collation for a language like Hindi, Kannada or Burmese had been chosen.
Thanks,
      GerardM

by noreply@blogger.com (GerardM) at 30 July, 2010 04:32 PM

Blog on Wiki Patterns

Redesigning the Airline Boarding Pass

Designer Tyler N. Thompson conducted an experiment to redesign airline boarding passes for finding information quickly in the crowded, hectic airport environment. Here’s his original boarding pass, and one of the reader-submitted redesigns:


Figure 1. Current Boarding Pass


Figure 2. Redesign by David Yoon

(Via Graphicology and Timoni Grone)

by Stewart Mader at 30 July, 2010 03:20 PM

The Un-Identity of e-Book Readers

Kevin Maney points out that, unlike books with their varied and descriptive covers, e-Reader devices conceal the identity of what you’re reading:

…the Kindle lets readers down with respect to one subtle but powerful element of the traditional book’s appeal: its role as an identity marker. Pulling out a particular book on an airline flight or in a doctor’s office can mean staking a claim to being a particular kind of person. Likewise, the books lining your living room or office can tell others about your interests and background. But on the Kindle, no matter what you’re reading, all anyone else will see is an unchanging plastic device.

(Via Clive Thompson)

by Stewart Mader at 30 July, 2010 02:21 PM

Gerard Meijssen

A story to tell

The Indonesian Wayang is a well-known theatrical tradition. For a living tradition, it is important that it keeps its relevance. UNESCO designated Wayang Kulit a masterpiece of the oral and intangible heritage of humanity.

With a history that goes back many, many centuries, Wayang has held on to its history and maintained its relevance by including contemporary references and story lines. It may not be that hard to guess what kind of story is told with this Wayang puppet.


We just received this magnificent treasure from the Tropenmuseum, and it shows how Wayang has kept its relevance. What makes the Tropenmuseum more than just a museum is that it gives attention to modern expressions. One such expression presented the theatre world of Ki Enthus Susmono earlier this year: a world where Batman, George Bush, Osama Bin Laden and the tsunami have found their place.


When traditions like Wayang constantly renew themselves, Wikipedia needs to reflect this and Commons needs to illustrate it. I hope that we will gain freely licensed pictures that show the vibrancy of this magnificent art.
Thanks,
      GerardM

by noreply@blogger.com (GerardM) at 30 July, 2010 09:44 AM

Pictures of the Day

Wikimedia Technical Blog

MediaWiki version statistics

Some kind people at Qualys have surveyed versions of open source web apps present on the web, including MediaWiki. Here is the relevant page from their presentation:

MediaWiki versions 2010-07-30

For the original see:

And the press release:

They make the point that 95% of MediaWiki installations have a “serious vulnerability”, whereas only 4% of WordPress installations do. While WordPress’s web-based upgrade utility certainly has a positive impact on security, I feel I should point out that what WordPress counts as a serious vulnerability does not align with MediaWiki’s definition of the same term.

For instance, if a web-based user could execute arbitrary PHP code on the server, compromising all data and user accounts, we would count that as the most serious sort of vulnerability, and we would do an immediate release to fix it. We’re proud of the fact that we haven’t had any such vulnerability in a stable release since 1.5.3 (December 2005).

However in WordPress, they count this as a feature, and all administrators can do it. Similarly, WordPress avoids the difficult problem of sanitising HTML and CSS while preserving a rich feature set by simply allowing all authors to post raw HTML.

If you are running MediaWiki in a CMS-like mode, with whitelist edit and account creation restricted, then I think it’s fair to say that in terms of security, you’re better off with MediaWiki 1.14.1 or later than you are with the latest version of WordPress.

However, the statistics presented by Qualys show that an alarming number of people are running versions of MediaWiki older than 1.14.1, which was the most recent fix for an XSS vulnerability exploitable without special privileges. There is certainly room for us to do better.
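
To make the "older than 1.14.1" test concrete, here is a minimal Python sketch of how a survey script might classify version strings. The function names and the sample list are invented for illustration and are not how Qualys or MediaWiki actually do it; the sketch also assumes plain dotted numeric versions and would need extra handling for suffixes such as release candidates.

```python
def parse_version(version):
    """Turn a dotted version string such as '1.14.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))

# Threshold named in the post: the release that fixed the XSS vulnerability.
MINIMUM_SAFE = parse_version("1.14.1")

def is_outdated(installed):
    """True if the installed version sorts before the threshold release."""
    return parse_version(installed) < MINIMUM_SAFE

# Made-up sample of surveyed installations.
surveyed = ["1.5.3", "1.13.2", "1.14.1", "1.15.4"]
outdated = [version for version in surveyed if is_outdated(version)]

print(outdated)
```

Comparing tuples of integers matters here: as plain strings, "1.5.3" sorts after "1.14.1", so a naive string comparison would wrongly treat the 2005 release as newer.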

We have a new installer project in development, which we hope to release in 1.17. It includes a feature which encourages users to sign up for our release announcements mailing list. But maybe we need to do more. Should we take a leaf from WordPress’s book, and nag administrators with a prominent notice when they are not using the latest version? Such a feature would require MediaWiki to “dial home”, which is controversial in our developer community.

by Tim Starling at 30 July, 2010 04:34 AM

29 July, 2010

Domas Mituzas

on primary keys

5.1.46 has this change:

Performance: While looking for the shortest index for a covering index scan, the optimizer did not consider the full row length for a clustered primary key, as in InnoDB. Secondary covering indexes will now be preferred, making full table scans less likely.

In other words, if you have a covering index on * (which is quite common on m:n mapping tables), the optimizer will now use it rather than the PK. As I have spent my time getting indexing right, with PKs based on the primary access pattern and SKs on the secondary access pattern, I hereby do not welcome a change that suddenly reverses the behavior in a late GA version.

Not good, when mysqldump queries end up taking 6 days instead of the previous half an hour, not good at all.
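
For readers who want to see why the quoted change flips the choice, here is a toy Python model. It is only a sketch under loud assumptions: the `Index` class, the byte counts, the index name `b_a` and the tie-breaking rule are all invented, and the real MySQL optimizer is far more involved than "pick the smallest number".

```python
from dataclasses import dataclass

@dataclass
class Index:
    name: str
    key_bytes: int          # bytes per entry for the indexed columns
    clustered: bool = False  # True for an InnoDB-style clustered primary key

def scan_cost(index, row_bytes, count_full_row):
    """Bytes read per row when scanning this index (toy cost model)."""
    if index.clustered and count_full_row:
        # A clustered PK stores the whole row, so scanning it reads the whole row.
        return row_bytes
    return index.key_bytes

def pick_covering_index(indexes, row_bytes, count_full_row):
    """Pick the cheapest index; min() keeps the first candidate on ties."""
    return min(indexes, key=lambda ix: scan_cost(ix, row_bytes, count_full_row))

# Hypothetical m:n mapping table (a_id, b_id):
# PRIMARY KEY (a_id, b_id) plus a secondary covering index (b_id, a_id).
indexes = [Index("PRIMARY", 8, clustered=True), Index("b_a", 8)]

old_choice = pick_covering_index(indexes, row_bytes=27, count_full_row=False).name
new_choice = pick_covering_index(indexes, row_bytes=27, count_full_row=True).name

print(old_choice, new_choice)
```

In this model the old rule sees two equally short indexes and stays with the primary key, while the new rule charges the primary key for the full row and switches to the secondary index, which is the reversal complained about above.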

Update: Oh, MariaDB has this reverted, from their change log:

mybug:39653: reverted as invalid

If only upstream MySQL would take note ;-)

by Domas Mituzas at 29 July, 2010 11:45 PM

OmegaWiki

#Conceptwiki

The ConceptWiki environment is an umbrella for several categories of professional data. It is hosted by NBIC / the Concept Web Alliance and it provides a rich environment with many types of data that have a bio-medical background.

The software it uses is very similar to the one used by OmegaWiki. This is due to the long association between people behind OmegaWiki and the ConceptWiki. It used to be that there was no room for multi-lingual content in the ConceptWiki but that is changing.

The hosting of OmegaWiki and the ConceptWiki used to be on the same server. This made it easy to connect OmegaWiki translations to ConceptWiki ontological content. For a concept like yaws you find many translations at OmegaWiki; you also find a link to the Wikipedia article and a mapping to the UMLS part of its database.

Indonesian dance mask

There is no Commons category yet on the subject, otherwise you might find this mask depicting the face of a sufferer of this disease.

The plan is to bring OmegaWiki and the ConceptWiki closer together again. I hope that the ConceptWiki will become a multilingual resource and we are going to start by sharing resources.
Thanks,
      GerardM

by noreply@blogger.com (GerardM) at 29 July, 2010 07:49 PM

AboutUs

Make Pay-Per-Click Ads Work for You

Pay-per-click, or PPC, can be a great strategy for driving the right people to your website.

But writing good ad copy isn’t easy. Christian Bullock, a senior account executive at search engine marketing firm Amplify Interactive, provides step-by-step instructions for writing effective PPC ads. Do it right, and you’ll get people clicking through to your site at the very moment when they’re seeking what you have to offer.

Christian has tons of experience creating PPC ads. He’s also certified by Google, Yahoo and Microsoft as a marketing and advertising professional. Use his tips, and you’ll be writing better PPC ad copy in no time at all. Bonus: Christian includes a link to a useful ad-writing tool on Amplify Interactive’s website.

Do you have some expertise in online marketing you’d like to share with business owners? Is there a topic you’d like us to cover? Send me an email: Aliza@AboutUs.org

by Aliza Earnshaw at 29 July, 2010 06:42 PM

Working Wikily

Acting bigger by activating networks

This post is third in a short series being published at the Intrepid Philanthropist. You can find the original here.

Yesterday I wrote a bit about the Strategy Landscape, an innovation that the Monitor Institute has been developing to help funders better “understand their context”—one of the 10 next practice areas we discuss in our new report, What’s Next for Philanthropy. The next practices represent principles and behaviors that are particularly well suited to the more networked, dynamic, and interdependent landscape of public problem solving that is now emerging. They’re approaches that we believe have the potential to become the widely accepted best practices of tomorrow.

The idea is that if the last decade was mostly about funders improving their individual organizational effectiveness and capacity, the work of the next 10 years will have to build on those efforts to develop next practices that also help funders ACT BIGGER and ADAPT BETTER.

  • ACT BIGGER, because given the scale and social complexity of the challenges they face, funders will increasingly look to other actors, both in philanthropy and across sectors, to activate sufficient resources to make sustainable progress on issues of shared concern. No funder alone, not even Bill Gates, has the resources and reach required to move the needle on these types of wicked problems.
  • And ADAPT BETTER, because given the pace of change today, funders will need to get smarter faster, incorporating the best available data and knowledge about what is working and regularly adjusting what they do to add value amidst the dynamic circumstances we all face.

My colleague Barbara Kibbe will be blogging after the summer about some of the ways that funders are beginning to think about adapting better, so I thought I’d write over the next day or two to explain a little more about what we mean when we talk about acting bigger.

In the report, we highlight five key ways that funders are beginning to act bigger:

  1. Understanding the context. Strong peripheral vision—seeing and developing a shared understanding of the system in which they operate—will be critical to helping funders build and coordinate resources to address large, complex problems.
  2. Picking the right tool(s) for the job. Funders have a wide range of assets—money, knowledge, networks, expertise, and influence—that can be applied deliberately to create social change.
  3. Aligning independent action. Philanthropies are developing new models for working together that allow for both coordination and independence. Funders don’t necessarily need to make decisions together, but they do need their efforts to add up.
  4. Activating networks. Advances in network theory and practice now allow funders to be more deliberate about supporting connectivity, coordinating networks, and thinking about how the collective impact of all of their efforts can produce change far beyond the success of any single grant, grantee, or donor.
  5. Leveraging others’ resources. Funders can use their independent resources as levers to catalyze much larger streams of funding and activity from other sources by stimulating markets, influencing public opinion and policy, and activating new players and assets.

The Strategy Landscape tool I discussed yesterday was an example of how funders are now beginning to develop new ways of understanding their context: how individual foundations and donors are learning to put the problem—not themselves and their organizations—at the center, and to try to recognize their role as actors within a larger ecosystem of stakeholders working on the issue.

Today, I’m going to talk briefly about another one of the next practices: activating networks.

Although the individual grant is the typical unit of analysis for most foundations, the success of any single grant or organization is rarely sufficient to move the needle on a complex problem. We’ve all felt the irony when successful programs are lauded while the system they aspire to change continues to fail.

Funders are well positioned to support connectivity and to coordinate and knit together the pieces of a network of activity that can have impacts far beyond the success of any one grant, grantee, or donor. And advances in network theory and practice now allow funders to be much more deliberate about supporting and participating in networks and in thinking about how the collective impact of a coordinated portfolio of grants can produce more significant change.

One of my favorite examples of how funders are already using networked approaches to act bigger comes from the Barr Foundation in Boston. The foundation’s Barr Fellows program aims to explicitly build a stronger network of civic leadership in the Boston area by providing fellowships and other activities to cohorts of nonprofit and other area leaders. The program offers the fellows a three month sabbatical and a number of retreats and other connective activities, including an international trip, over the course of three years. The idea is to help the fellows build the trust and relationships that, once they return to their jobs (refreshed and inspired), will allow them to self-organize and work together in the future as needs arise. Instead of simply supporting a group working on a single, specific issue, Barr is building the bonds and linkages that create a robust network that can respond more effectively to challenges of all sorts over time.

The Fellows program is just one example of how Barr is explicitly building networks to advance its programmatic goals. In other cases, they’ve used social network mapping to help groups of local green and healthy building advocates better recognize common goals, and have supported “network weavers” to help build local connections and capacity around after school programming.

Barr is just one foundation working on the cutting edge of networks. At the Monitor Institute, we’ve been working with a number of different funders over the last several years, particularly the David and Lucile Packard Foundation, to develop tools and training curriculum that help them and their grantees better understand, build, support, and work as part of networks.

If you’re interested in learning more about how funders and nonprofits can use networks to advance their work, there are a growing number of great resources out there: Beth Kanter and Allison Fine’s new book The Networked Nonprofit: Connecting with Social Media to Drive Change (or Beth and Allison’s respective blogs), Pete Plastrik and Madeleine Taylor’s Net Gains: A Handbook for Network Builders Seeking Social Change, Clay Shirky’s Here Comes Everybody: The Power of Organizing Without Organizations, and even the Monitor Institute’s own Working Wikily article (and its accompanying blog).


by Gabriel Kasper at 29 July, 2010 04:39 PM

Blog on Wiki Patterns

“At Least iPod, if Not iPad Generation”

Patrick Wintour, describing new British PM David Cameron in The Guardian:

Cameron feels at least iPod, if not iPad generation: relaxed appearances on the networks, dinner with Washington Post columnists, and meetings with all the key congressional leaders…

by Stewart Mader at 29 July, 2010 03:18 PM

A Thought on Inspiration and Aspiration

Whitney Hess:

Inspiration is necessary in order for aspiration to grow. But only until we’re filled with aspiration, self-initiated and self-defined, do our careers, and lives, really begin to transform, allowing us to transform those around us in turn.

by Stewart Mader at 29 July, 2010 02:53 PM

OmegaWiki

Endangered African languages

Sorosoro is a program that aims at studying and documenting endangered languages.

They have recently published videos of native people giving some vocabulary in four endangered African languages: Punu, Mpongwe, Akele and Benga. The words are about body parts, numbers, colors and common phrases.



I have enabled these languages for editing at OmegaWiki and added all the words mentioned in the videos. By doing so, these translations are not only available to people speaking English, French or Spanish (the three languages of the video), but also to speakers of all the other languages at OmegaWiki.

We now have 28 expressions in Akele and Benga, 59 in Mpongwe and 78 in Punu.

If you know about more resources (vocabulary) for endangered languages, you are more than welcome to mention them.

Thanks,
Kipcool.

by noreply@blogger.com (Kipcool) at 29 July, 2010 12:38 PM

EditMe

Recycle Bin for Deleted Pages and Attachments

A small change made today makes it a bit simpler to keep your EditMe sites tidy.

29 July, 2010 12:17 PM

Piotr Konieczny

Bridging of the global digital divide

I was looking at some archival stats on Internet usage.

Compare the Top 10 Internet languages in 2004, 2005, 2006, 2007, 2008 and the latest (Dec 2009 for me, your mileage may vary as that link is dynamic).

Look at the 2004 numbers: it is very much the core, developed, First world, China aside.

Still, its pitiful penetration numbers aside... what was (is?) wrong with France, Spain and Portugal? Their penetration numbers are China-like (~10%), whereas the rest of Europe was around 50%.

Now, look at 2010. In addition to China, whose penetration numbers have improved greatly (from 2004's 8% to 29.7%), note the inclusion of Arabic (17.5%) and Russian (32.3%).

Oh, don't confuse languages with countries. I was for a moment shocked by the low penetration numbers for French, Spanish and Portuguese, but remember: those languages are spoken in many periphery, developing, Third World countries in Africa and Latin & South America, thus the pitiful results (here are EU stats, and here is a breakdown for Spanish language, for example).

One of the most interesting numbers, for me, is the diminishing share of English speakers among Internet users: from 35.9% in 2004 to 30.1% in 2009.

And how about the tripling of Internet user totals in non-Top 10 languages (from 100m to 300m), or the near doubling of Internet user totals overall (from 800m to 1,400m) in that period?
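
The arithmetic behind those comparisons is easy to check. Here is a small Python sketch that uses only the rounded figures quoted in this post; the function name `percent_growth` is my own and the inputs are not the stat site's exact data.

```python
def percent_growth(old, new):
    """Growth from old to new, expressed as a percentage of old."""
    return (new - old) / old * 100

# Internet users in millions, as quoted above (2004 vs. 2009).
non_top10_growth = percent_growth(100, 300)    # non-Top 10 languages
all_users_growth = percent_growth(800, 1400)   # all languages

# English speakers' share of Internet users, in percentage points.
english_share_drop = round(35.9 - 30.1, 1)

print(non_top10_growth, all_users_growth, english_share_drop)
```

So "tripling" is 200% growth, "nearly doubling" is 75% growth, and the English share fell by 5.8 percentage points.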

Around 2006 the stat site started to report "Internet Growth for Language (since 2000)". For 2006, it was 128% for English, 346% for Chinese, 436% for non-Top 10 languages, and 189% for all languages. In 2007, Arabic enters the chart, with 940% growth (!). For 2009 we have 251.7% for English, 1162% for Chinese, 2297% for Arabic, 525% for non-Top 10 languages and 400% for the world.

The global digital divide still exists, no doubt about that. But it is being bridged, as the rest of the world is catching up. 'bout time...

PS. Here's an interesting take on this from The Economist.

by Piotr Konieczny (noreply@blogger.com) at 29 July, 2010 12:14 PM

Appropedia Blog

Tell your friends

Today I chatted to a stranger at an Indian diner, and when I mentioned Appropedia, he asked me to send him some links for his friend, who is a wastewater engineer.

I'll share my message here, in case you know a water or wastewater engineer, or other knowledgeable person, and perhaps to prompt you to invite knowledgeable friends to contribute.

Hi ------,

Nice to meet you today. A few links for your friend -----, from our collaborative website, Appropedia.org:

Note that it covers all kinds of contexts, and there is an emphasis on cost-effectiveness and applications where resources are limited.

If he is interested or knows anyone else who might be, we're always in need of people to click "edit" and share their knowledge.

Thank you,

------


by Chriswaterguy at 29 July, 2010 08:02 AM

Blog on Wiki Patterns

Photos: President Obama’s Motorcade in Lower Manhattan

This was a fun way to end a busy day. As I was walking home, the President’s motorcade came down Houston Street on the way to an event in Greenwich Village. I took the first photo using an iPhone, and went out later with my Sony Alpha 300 to catch the motorcade leaving Manhattan.

East Houston Street, 7:17 PM – Presidential State Car (top right) followed by secret service car.

FDR Drive South, 8:55 PM – Presidential State Car (left) following secret service car.

FDR Drive South, 8:55 PM – NYPD squad cars and secret service SUVs follow the President to the Downtown Manhattan Heliport.

by Stewart Mader at 29 July, 2010 02:01 AM

28 July, 2010

Gerard Meijssen

#Tropenmuseum brings us "stock-puppets" and more

The collections that we were happy to receive in #Commons were typically old photographs or pictures of all kinds. This time the Tropenmuseum brings us something new: pictures of objects. Objects like these amazing wayang puppets.

screenshot from the Commons category

There are manuscripts in many languages on many materials, statues, Korans, jewellery, weapons, miniatures, musical instruments, textiles, pottery, proclamations, designs for silver jewellery ... the list goes on.


A paddy field fisherman with a fishing rod.

This new collection is vast. Multichil started to upload the new pictures and really, there is so much that it will take real effort to grasp what it is that we received from the Tropenmuseum.
Thanks,
      GerardM

by noreply@blogger.com (GerardM) at 28 July, 2010 10:01 PM

Daniel Kinzler

Neo4j

Free Content

neo4j is a graph database written in Java (neo4j.org). I recently poked at it a little to see if it could be used to make fast queries over Wikipedia's category structure.

The Problem

Using the category structure when searching content on Wikipedia, or when looking for maintenance tasks in a specific topic area, has long been a pending item on the wishlist of a lot of people. Some years back, I wrote catscan to address the issue, but it's slow, truncates results, is prone to failure, and is generally ugly. So I'm looking for better ways to do this, and neo4j looks like an option.

But first off, a closer look at the problem: Categories on Wikipedia are not tags: they can't easily be combined (intersected), but they can be put into relation to each other (making subcategories). A category can be a subcategory of several other categories: American Writers may be a subcategory of American people and Writers. By convention, there should be a single root category, and there should be no cycles in the category structure, so the resulting graph is a directed graph that has no cycles and is (weakly) connected. This is also called a poly-hierarchy. However, there is nothing that actually prevents cycles, and nothing that forces the structure to be connected. So, both loops and islands may occur.

The most wanted feature now is commonly called deep category intersection: we want all pages that are contained in two categories, while also considering all of their subcategories. Formally, this is the intersection of the transitive closures of the two categories along the subcategory relation. Calculating the transitive closure is typically done by recursively evaluating all subcategories. However, this is something traditional relational database systems are particularly bad at: it's only possible with lots of individual queries, which makes the process quite slow.
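
To pin down what deep category intersection computes, here is a minimal in-memory Python sketch. It is not catscan and it does not use neo4j; the function names, the dictionary layout and the sample data are invented for illustration. The `seen` set is what keeps the traversal from running forever when the category graph contains loops.

```python
from collections import deque

def category_closure(root, subcategories):
    """All categories reachable from root via subcategory links, root included."""
    seen = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for child in subcategories.get(current, ()):
            if child not in seen:  # skip categories already visited (loops)
                seen.add(child)
                queue.append(child)
    return seen

def pages_under(root, subcategories, members):
    """All pages in root or in any of its direct or indirect subcategories."""
    pages = set()
    for category in category_closure(root, subcategories):
        pages.update(members.get(category, ()))
    return pages

def deep_intersection(cat_a, cat_b, subcategories, members):
    """Pages found under both categories, subcategories included."""
    return (pages_under(cat_a, subcategories, members)
            & pages_under(cat_b, subcategories, members))

# Invented sample data; note the deliberate loop Writers -> Poets -> Writers.
subcategories = {
    "American people": ["American Writers"],
    "Writers": ["American Writers", "Poets"],
    "Poets": ["Writers"],
}
members = {
    "American Writers": ["Mark Twain"],
    "Poets": ["Emily Dickinson", "Matsuo Basho"],
    "American people": ["Emily Dickinson", "Abraham Lincoln"],
}

result = deep_intersection("American people", "Writers", subcategories, members)
print(sorted(result))
```

Each step of the queue corresponds to one "give me the subcategories of X" lookup, which is why doing the same thing against a relational database ends up as many individual queries.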

== The Idea == [...Neo4j...]

by Daniel at BrightByte at 28 July, 2010 08:15 PM

EditMe

Display Recent Changes to the Current Page

A new include script is now available that displays recent changes to the current page. You might include this in your site's Menu area to provide insight into what's been going on with the page being viewed.

28 July, 2010 04:39 PM

Working Wikily

Tools: making it easier to work in new ways

This post is second in a short series being published at the Intrepid Philanthropist. You can find the original here.

Before I dive into some of the different “next practices” highlighted yesterday that we think may become important parts of philanthropy’s future, I wanted to first say a few words about one of the key pieces of what I think it’ll actually take for funders to start acting bigger and adapting better over the next decade.

Change in philanthropy is especially hard. As organizational theorist Edgar Schein puts it, the only time that organizations learn and change is when the normal level of “learning anxiety”—the anxiety produced by having to shift and learn something new—is trumped by “survival anxiety”—the anxiety produced upon realizing that if you don’t adapt or improve you’ll be forced out of your position or out of business entirely.

Among many endowed philanthropic institutions, there is almost never a threat that raises survival anxiety, which means, in turn, that there is nothing that forces philanthropic organizations to get over their learning anxiety in any consistent way. The result is a field in which many of the most powerful players have few (if any) incentives to prompt adaptation and behavior change.

For years, we’ve joked that this dynamic has left philanthropy with a unique set of “learning disabilities” that get in the way of change in the field. In What’s Next for Philanthropy, we look deeply at a number of these different barriers: funders’ need for independence and control, their insularity and inward focus, the cumulative effects of caution and risk aversion, the challenges of time and inertia, and the dangers of an unspoken competition for reputation and credit.

Because the field is both voluntary and independent by nature—unconstrained by the need to please political constituencies or maintain shareholder value—these challenges add up to a situation where there is no pressure that forces any one actor to respond to another, to learn, or to change course. Individual philanthropists and institutions can act without much reference to the success or failure of their efforts or to what others do.

The result is a system with no natural mechanism for coordinating effort, for learning, for sharing knowledge about what does and doesn’t work, or for adapting to shifting circumstances. And given that learning and adaptation are optional in philanthropy, it’s hard for the field to overcome the inertia of the status quo.

Which is why I’ve become obsessed over the last year or so with tools. Is it possible to facilitate change in philanthropy by building tools that make it easier for funders to do the “right” things and harder to do the “wrong” things? The status quo is typically the easiest road to follow. But what if we could create new tools that make the path to new behaviors just as easy?

The problem is that the barriers I mentioned above make adopting new tools in philanthropy extremely difficult. Top-down, centralized, sector-wide tools and infrastructure are often rejected, even if they could improve performance. And at the same time, bottom-up innovations—individual foundations creating specific solutions to their particular problems and circumstances—rarely spread or scale. One foundation’s innovation remains just that: one foundation’s innovation.

For funders to truly begin acting in new ways, we will need to begin to merge top-down and bottom-up mindsets to develop new tools and platforms that help individual funders do their own work better, while at the same time designing with interoperability in mind. That way the data and knowledge gathered by one actor can be integrated with that gathered by others, with modest investments of money and time. Since many funders face similar issues, tools and behaviors developed to solve specific problems—but with an interoperable mindset—can begin to build useful and healthy platforms, standards, and conventions that cross institutions to add up to something more powerful than just multiple individual solutions.

Over the last two years, we’ve been working with the Rockefeller Foundation to experiment with developing these types of new, interoperable tools.

The first innovation we’ve produced, the Strategy Landscape, was created as a way of helping visualize the strategies and grants both within, and across, foundations. The project aimed to solve a key internal challenge that the Rockefeller Foundation was facing: how to help people understand the connection between its different programs’ strategies and their grants. It was an important issue for Rockefeller, but also a problem faced by many other funders as well.

So we began building a tool that could help the foundation easily visualize and understand how the strategies and grants of each of its various initiatives are aligned. But we designed the tool with special attention to interoperability—how it could also be used by, and across, other funders. When seen as a collective platform, the tool actually becomes more than just an assortment of individual maps that allow us to better understand each foundation; it allows us to start mashing the maps together so we can see the entire landscape of strategies and grants across the different funders.

In the past, foundations have done their work essentially flying blind, unclear about what others with similar interests are doing, and without a clear picture of the ecosystem of funders around them. It simply took too much effort to know what everyone else was doing. You could spend all day on the phone or in meetings with other funders trying to find out what they’re funding, and still only come out of it with a vague sense of what they’re up to.

The Strategy Landscape aims to make it simple for funders to see what their peers are supporting, making it easier than ever before to see gaps and overlaps between foundations, to identify new opportunities for coordination and collaboration, and to develop strategy with an understanding of the larger system in which they operate. The goal is to make it so easy to see and understand the broader context that you would actually have to make a choice not to see the landscape of funders around you.

The first prototype of the Strategy Landscape is now being developed to map philanthropic funding flows related to climate change across more than a dozen funders, and we will be testing it with numerous other issues over the coming year.

But we also hope that the tool will help to kick off a wave of new innovation in philanthropy—the first of many new approaches that will span across all of the different areas of next practice that I’ll be discussing over the next few days.

by Gabriel Kasper at 28 July, 2010 04:28 PM

Jeroen De Dauw

Maps and Semantic Maps 0.6.5 released

Maps and Semantic Maps 0.6.5 are now available for download. This release contains mainly internal changes to improve code modularity and fix some security concerns. Several bugs have been fixed as well, and a new hook has been added to Semantic Maps. With SMW 1.5.2 or above, this hook makes the map format the default for queries where you only ask for coordinates. For a full list of changes since 0.6.4 see changes to Maps and changes to SM. Everyone running 0.6.2 or older is advised to upgrade as soon as possible.

This release is notable for being the first one in which I’m happy with the code-base as a whole. It took me a year to get here, but now I think the way the mapping extensions work is good and solid. This means you can now extend Maps and not be afraid the code will be incompatible in a few weeks due to changes. This also means that I’ll be focusing more on actual functionality rather than refactoring in future releases. I’ll be progressively building a little guide that explains how the extensions work from a developer’s perspective and how to extend them.

I might release another minor update in the 0.6.x series if any significant issues are found in 0.6.5. Further plans are finishing up a bunch of changes I’ve started to make in Validator, which I’ll probably release as 0.4 then, and to start working on Maps and Semantic Maps 0.7, which would aim at adding new features and improving existing ones. A likely new feature I’m particularly looking forward to implementing is several tag extensions that do the equivalent of the current parser functions added by Maps. The timetable for all this depends a lot on which other things I get caught up in (I’ll probably continue putting effort into the deployment stuff for my GSoC project) and what kind of funding will be available.

Downloads:

  • Maps 0.6.5 [zip - 7z]
  • Maps and Semantic Maps 0.6.5 [zip - 7z]

You can also view the release announcement at the documentation wiki.


by Jeroen De Dauw at 28 July, 2010 04:01 PM

Gerard Meijssen

Failing #statistics (finally)

Now that Erik Zachte has announced issues with the statistics as published, it is a good moment to reflect. It is the aim of the Wikimedia Foundation to double its reach in five years' time, doubling our traffic. The expected result is expressed numerically and consequently we require hard numbers.

There are numbers on Wikimedia's traffic from the likes of Alexa and comScore, so there are alternative numbers providing us with a second opinion. Their numbers, while good, are no alternative to the numbers needed for our own purposes.

The numbers are used in many ways and for many audiences. They are important for the GLAMs that contributed material to us. These same numbers provide the arguments for other GLAMs to work with us. They are used to learn how the competition is doing. They provide background numbers when we talk to the press on many subjects.

Our statistics are vital. When I asked for a slot for a panel discussion at Wikimania about statistics, the numbers ended up being quite different. I am now at a loss how to appreciate the numbers we have. I understand that some statistics will be approximated to what they should have been. Other numbers will not receive such royal treatment.

This mishap is painful and I really hope it is felt that way. As we have several people working professionally on statistics, as many studies are done based on the numbers we provide, as the Toolserver is another resource that relies heavily on us accruing the right numbers, it is fair to call statistics one of our primary processes.

For our other databases we have redundancies. I hope that we will learn from those responsible for accumulating the data behind our statistics how our data collection will be made more robust in the future.
Thanks,
       GerardM

by noreply@blogger.com (GerardM) at 28 July, 2010 09:49 AM

Wikimedia Technical Blog

MediaWiki 1.16.0

We are proud to announce the first stable release of the 1.16 series. Selected changes that may be of interest since MediaWiki 1.15 are:

  • Watchlists now have RSS/Atom feeds. RSS feeds generally are now hidden, since Atom is a better protocol and is supported by virtually all clients.
  • It’s now possible to block users from sending email via Special:Emailuser.
  • The maintenance script system was overhauled. Most maintenance scripts now have a useful help page when you run them with --help.
  • AdminSettings.php is no longer required in order to run maintenance scripts. You can just set $wgDBadminuser and $wgDBadminpassword in your LocalSettings.php instead.
  • The preferences system was overhauled. Preferences are stored in a more compact format. Changes to site default preferences will automatically affect all users who have not chosen a different preference.
  • Support for SQLite was improved. Some broken features were fixed, and it now has an efficient full-text search.
  • The user groups ACL system was improved by allowing rights to be revoked, instead of just granted.
  • A new localisation caching system was introduced, which will make MediaWiki faster for almost everyone, especially when lots of extensions are enabled.

By default, this new system makes a lot of database queries. If your database is particularly slow, or if your system administrator limits your query count, or if you want to squeeze as much performance as possible out of MediaWiki, set $wgCacheDirectory to a writable path on the local filesystem. Make sure you have the DBA extension for PHP installed; this will improve performance further.
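As a rough illustration of the two configuration changes mentioned above, the lines below sketch what might be added to an existing LocalSettings.php. Only the three variable names come from the announcement; the account name, password and cache path are placeholders chosen for this example and need to be replaced with values that fit your own installation.

```php
// Fragment to add to an existing LocalSettings.php (not a complete file).
// All values below are hypothetical placeholders.

// Database credentials used by maintenance scripts,
// instead of a separate AdminSettings.php.
$wgDBadminuser     = 'wikiadmin';   // placeholder account name
$wgDBadminpassword = 'change-me';   // placeholder password

// Writable directory on the local filesystem for the localisation cache.
$wgCacheDirectory = '/var/cache/mediawiki';   // placeholder path
```

The directory has to be writable by the user the web server runs as, otherwise the cache cannot be stored there.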

MediaWiki 1.15.5 was also released today. Both MediaWiki 1.15.5 and 1.16.0 contain important security fixes. For further details please read the release announcement.

by Tim Starling at 28 July, 2010 08:26 AM

Guillaume Paumier

WikiSym 2010

Three weeks ago, I attended the WikiSym 2010 conference. WikiSym is the “International Symposium on Wikis and Open Collaboration”; it’s sort of the “Academic Wikimania”, where people researching wikis, Wikipedia and generally open collaboration get together and share their findings.

Banner containing the text WikiSym 2010 and a tagline, on a fading background image depicting warehouses

The WikiSym 2010 banner, designed by yours truly (except for the logo, by David Bailey).

WikiSym & Wikimania in Gdańsk

I couldn’t attend WikiSym in previous years for various reasons, the main one being money: the venues were too far away and the registration fee was too expensive. This year, WikiSym was co-located with Wikimania in Gdańsk, Poland, so it was a perfect opportunity for researchers & Wikimedians (or “practitioners”, as researchers call them) to get together and meet. We have to thank my friend Phoebe Ayers, this year’s Chair/Organizer of WikiSym, for that.

I was pretty excited because WikiSym seemed to be at the crossroads of two of my circles: academia & open collaboration. I was also really looking forward to meeting researchers: the Wikimedia Foundation is currently engaged in an effort to include research into their decision-making process in order to make it more data-driven. Thus, it was the perfect time to try and build awareness, understanding and relationships between the two communities.

A set of carton boards of several colors, each marked with a specific time and symbol

Open space schedule boards at WikiSym 2010

Program & Sessions

The symposium was a mix of regular conference talks and an unconference-style Open space track. I wasn’t necessarily a big fan of the unconference style, but one of the most productive discussions I had (about quality assessment tools) actually happened in a group I walked in a bit randomly. I was a bit disappointed by the quality of some talks, but overall the event was great.

One major issue, though, was the number of conflicting talks: there were up to eight concurrent open space sessions, conflicting with each other, as well as with the main sessions and workshops. It was just impossible to take part in everything one was interested in. While this is a usual problem with large events, I didn’t expect to have this issue in a relatively small conference like WikiSym.

I gave a presentation entitled Understanding the users of Wikimedia Commons, a summary of the user research I did for the Multimedia usability project, and it was pretty well received. The audience particularly liked the video I showed from our UX study. The supporting slides are available on Commons (download the PDF – 504 KB). Unfortunately, the presentation wasn’t recorded, but it was similar to the one I gave at Wikimania, whose recording will be available soonish.

Guillaume Paumier on a stage giving a presentation, with a huge screen behind him

Me trying not to burst into song, being on stage in such a nice Concert hall (CC-by-sa by Blue Oxen Associates)

Not as open as you might think

The second day ended with a discussion in the “Open circle” about copyright. Specifically, the participants asked if they could publish their work (that they presented at WikiSym) under a free license. I was particularly interested, since I had had the very same discussion a few months before.

A large room with wooden floor, and a few dozen chairs assembled in a circle with several rows

« Open circle » at WikiSym 2010

In March 2010, I submitted a scientific paper to WikiSym about my work. I had written papers for scientific journals & conferences before, but it was the first time I submitted one in this specific field of research. As a consequence, I was quite happy when my paper was accepted.

Then came the copyright transfer issue. WikiSym partnered with the ACM to publish the proceedings of the conference, and the ACM asked me to transfer my copyright to them. While this is fairly standard in the scientific publishing industry [1], I was surprised by this requirement considering the field of research involved (open collaboration and free knowledge).

I shared my concerns with Phoebe and with Felipe Ortega (Chair of the Program Committee), who reached out to the ACM. The ACM wouldn’t let me release my work under a free license such as Creative Commons Attribution Share Alike (CC-by-sa). I felt my research belonged to the Wikimedia community, and I didn’t want to enclose my work within the ACM’s intellectual property prison.

Hence, I refused to sign the copyright transfer form, even though it meant not being able to present my work at WikiSym. In the end, thanks to Phoebe & Felipe’s efforts and discussions with the WikiSym committee, I was allowed to present my work, but only as a lightning talk, and it wasn’t included in the conference proceedings.

I do hope, though, that at some point we’ll be able to move towards a more open access & reuse model [2], in accord with the philosophy of open collaboration and free works.

Notes

  1. Standard and outrageous, but you don’t have much flexibility when your degree or career depends on it, unless your employer/university encourages you to publish in Open Access journals (like the MIT does).
  2. “Gold”, and not just “green” open access. See The ACM is NOT Open Access, by Michael Mitzenmacher, for more information, and More on the Author Addendum Kerfuffle (and Counterproductive Over-Reaching), by Stevan Harnad, for an opposing view.

by Guillaume Paumier at 28 July, 2010 12:46 AM

27 July, 2010

Erik Zachte

Wikimedia page views, some good and bad news

First the good news: there is a new summary report for Wikimedia page views that presents trends for nearly all projects on a single page.

Now the bad news: a few days ago it was established that the server that collects and aggregates log data for all squids could not keep up with all incoming messages, and hence underreported page views. When I suggested that recent page view trends looked very suspicious, Tim Starling and Mark Bergsma quickly analyzed the cause and fixed the server overload. Kudos to them. For April - July 2010 I could still infer the amount of underreporting from available log files. Counts for these months have been corrected. For earlier months, possibly from Nov 2009 till March 2010, counts are still too low. For details on the error correction see this pdf.

Reports affected: all wikistats reports that are based on dammit.lt hourly log files are affected, notably page view reports and server request reports. The same goes for the monthly Report Card. Earlier editions of the monthly server request reports are not yet corrected like the page view trend reports (maybe just a notice will be added), and of course even though absolute numbers are too low, comparisons are not affected (e.g. market share per browser or OS). Other sites that build on these log data will also be affected, notably stats.grok.se, trendingtopics, amaglamate.

by Erik at 27 July, 2010 11:31 PM

Gerard Meijssen

A skin for a vertical script


The continued work that is done to enable #SignWriting is awesome. The SignWriting script is written top down, and this needs to be reflected in the user interface, the skin. I love the design.

I wonder what a Wikipedia logo would look like :)



Anyway, here is also the video that goes with the blog post. It is in ASL, and the actual blog post does not feature the new skin yet... :)
Thanks,
      GerardM

by noreply@blogger.com (GerardM) at 27 July, 2010 10:18 PM

AboutUs

Help Google Find Your Web Pages

Sitemap for EnergyTrust.org


You’ve put time, energy — and maybe even money — into building a website for your business. Now how do you tell search engines you’re here?

AboutUs community manager Kristina Weis shares the ins and outs of building a simple HTML sitemap in her latest article for website owners. She explains how a sitemap can up your chances of having your site appear in search results, and tells you how to figure out which type of sitemap is most appropriate for your website.

We want to offer AboutUs.org visitors the best guidance and expertise about using the web to grow one’s business. Please let us know what you want to read about — or let us know what you could write about — by sending me an email: Aliza@AboutUs.org.

by Aliza Earnshaw at 27 July, 2010 07:15 PM

Working Wikily

Innovating next practices for philanthropy’s next decade

This post is the first in a short series being published at the Intrepid Philanthropist. You can find the original here.

When the Monitor Institute first started its exploration of the evolving “future of philanthropy” ten years ago, I was one of its funders, a program officer at the Packard Foundation. A big part of what we were trying to do was to create an urgency and an awareness that the world around philanthropy was changing, and that if philanthropy was going to remain relevant and achieve its potential in the coming years, the field—and the institutions and individuals within it—were going to need to change too.

Now, ten years and a financial crisis (or two, if you want to count the dot-com bust) later, I’m working on the other side of the coin. The challenge is no longer about convincing anyone that the world around philanthropy is changing. An intimidating range of forces—blurring sectoral roles, new connective technologies, and globalization—are transforming the landscape of public problem solving. We face “wicked problems” (to borrow the language of design theorist Horst Rittel)—large, complex social and environmental challenges that don’t adhere to traditional geographic and disciplinary boundaries, and where both the problem and the solution are often unclear and shifting.

And in this new landscape, the question isn’t about whether and how the world is changing. It’s about how funders can have a greater impact in a world that’s already shifting—and will continue to do so.

But as organized philanthropy in the United States hits its century mark, what’s quite remarkable is that many of the field’s core principles and practices remain remarkably similar to the ones created by John D. Rockefeller and Andrew Carnegie when they first created the foundation form 100 years ago. The world around philanthropy is changing much, much faster than philanthropy itself.

So the pressing question for today, and for the future, is about how funders can begin to institute, adapt, and invent practices and approaches that will better fit the emerging environment in which they work. For philanthropic and civic leaders looking to cultivate change in today’s rapidly shifting landscape, simply tweaking the status quo and adopting established best practices won’t be enough. Funders will have to pioneer “next practices”—effective approaches that are well-suited to tomorrow’s more networked, dynamic, and interdependent context.

In our new report, What’s Next for Philanthropy: Acting Bigger and Adapting Better in a Networked World, Katherine Fulton, Barbara Kibbe, and I have put forward our best thinking about the shifting landscape for philanthropy, and about ten key practices and principles that we believe can help funders achieve greater impact in the coming decade. We feel that while the cutting edge of philanthropic innovation over the last decade has been mostly about improving the effectiveness, efficiency, and responsiveness of individual organizations, the next practices of the coming 10 years will have to build on those efforts to include an additional focus on coordination and adaptation—how funders can act bigger and adapt better.

We highlight five practices that funders can use to act bigger:

  • Understand the context
  • Pick the right tool(s) for the job
  • Align independent action
  • Activate networks
  • Leverage others’ resources

And five approaches to help them adapt better in the coming decade:

  • Know what works (and what doesn’t)
  • Keep pace with change
  • Open up to new inputs
  • Share by default
  • Take smart risks

These practices are by no means new; innovative funders have been doing many of these things, and doing them well, for years. And we certainly don’t pretend that the list is in any way comprehensive. But we believe that these ten practices represent what Chip and Dan Heath (in their new book Switch) refer to as “bright spots”—instances where new strategies are showing especially great promise, especially as emerging tools and approaches catch up with the aspirations of funders in the new context.

Over the next few days, I’ll dive a bit deeper into a number of these next practices, and hope you’ll join me in starting to think intentionally about how we might innovate new ways of working that will become the best practices of the coming decade.

by Gabriel Kasper at 27 July, 2010 06:09 PM

Ziko van Dijk

Ziko

Recently, I heard that some teachers find Wikipedia articles (in German) too long and too complicated to use at school. When I asked a school expert, he had another objection.

It has become customary in the German states that pupils have to write a high school paper at the end of their school career. The idea is (among others) to lead pupils to the basics of scientific work, citing sources. But what “sources” are you allowed to use? General reference books, local newspapers, leaflets of commercial and non-profit organisations?

The official sites of the 16 German school ministries did not provide an answer, so I asked an expert recommended by North Rhine-Westphalia’s ministry.

Mr. Philipp Portscheller said that high school papers can deal with very different subjects; some pupils write about a theatre project, others about genetics or the acceleration of trains in a station. So there cannot be unified catalogues of what to use as a source.

“Wikipedia is popular among pupils, but there are problems when dealing with it, because [Wikipedia] treats subjects at large, and in the end it would be sufficient, when we talk purely about the content, to print the Wikipedia [article].”

With that in mind, the tasks given to pupils are set up so that an encyclopedia can be used but does not cover the core of the subject, Mr. Portscheller said.


by Ziko van Dijk at 27 July, 2010 01:20 PM

Aaron Swartz

You Don’t Know John (Maynard Keynes)

From the right, Gary Becker writes:

Keynes and many earlier economists emphasized that unemployment rises during recessions because nominal wage rates tend to be inflexible in the downward direction.

From the left, Matt Yglesias writes:

…the Keynesian prescription is not only for the government to run deficits in response to recessions, but to run surpluses in expansions. Thus, the Clinton administration’s fiscal policies were arguably “Keynesian” but the Reagan and (especially) George W Bush administrations were implementing an agenda that flew in the face of Keynes’ ideas much more clearly than anything Angela Merkel’s ever done.

Neither of these is true at all. Pretty much the very first thing Keynes says in the General Theory is that downwardly-inflexible nominal wage rates (sticky wages) are a good thing. And he spends a large part of chapter 8 denouncing the practice of saving surpluses (sinking funds).

So where do they get this stuff? While these aren’t the views of Keynes, both these views are held by the so-called “New Keynesians” — people like Paul Krugman and Greg Mankiw, who have tried to shoehorn a moderate version of Keynes into classical economics. These proponents are rather more prominent than more traditional Keynesians like Jamie Galbraith, so political commentators hear their view and assume it’s a faithful representation of Keynes’ own.

Perhaps Keynes was wrong — after all, we shouldn’t slavishly follow the scribblings of some defunct economist. But if so, we should tell the truth and admit we’re disagreeing with Keynes, not expounding his ideas. (Neither Yglesias nor Becker has run a correction, despite my emails.) Furthermore, we should actually engage with Keynes’ argument.

On sticky wages, Keynes says that if nominal wages could fall, then nominal costs would fall, which would mean that nominal prices would fall, which means that real wages would end up staying the same. [1] But, even worse, if there was no stickiness at all, nothing would stop nominal wages from falling further and further until eventually everyone was paid zero. [2] I have never heard the New Keynesians respond to this argument.

On the question of surpluses, Keynes criticizes them as a pointless reduction of aggregate demand. They create unemployment because they take money out of circulation for no real purpose. It’s just supposed to sit around until a “rainy day” when the economy isn’t doing so well. But when that rainy day comes, the reason the economy isn’t doing well is because people are out of work. If that’s true, you can simply print more money to get them back to work without any ill effects. (Printing money only causes inflation at full employment.) You don’t get any benefit from having taken the money out of circulation earlier. [3]

Both these seem like strong arguments to me. Perhaps that’s why it’s easier to pretend they don’t exist.


  1. Chapter 2:

    …if money-wages change, one would have expected the classical school to argue that prices would change in almost the same proportion, leaving the real wage and the level of unemployment practically the same as before, any small gain or loss to labour being at the expense or profit of other elements of marginal cost which have been left unaltered.

  2. Chapter 21:

    If, on the contrary, money-wages were to fall without limit whenever there was a tendency for less than full employment, the asymmetry would, indeed, disappear. But in that case there would be no resting-place below full employment until either the rate of interest was incapable of falling further or wages were zero. In fact we must have some factor, the value of which in terms of money is, if not fixed, at least sticky, to give us any stability of values in a monetary system.

  3. Chapter 8:

    We must also take account of the effect on the aggregate propensity to consume of Government sinking funds for the discharge of debt paid for out of ordinary taxation. For these represent a species of corporate saving, so that a policy of substantial sinking funds must be regarded in given circumstances as reducing the propensity to consume. It is for this reason that a change-over from a policy of Government borrowing to the opposite policy of providing sinking funds (or vice versa) is capable of causing a severe contraction (or marked expansion) of effective demand.

    […]

    Or again, in Great Britain at the present time (1935) [thanks to] the principles of “sound” finance [sinking funds are so large] that even if private individuals were ready to spend the whole of their net incomes it would be a severe task to restore full employment…The sinking funds of local authorities now stand … at an annual figure of more than half the amount which these authorities are spending on the whole of their new developments. [footnote giving the amounts] Yet it is not certain that the Ministry of Health are aware, when they insist on stiff sinking funds by local authorities, how much they may be aggravating the problem of unemployment.

27 July, 2010 12:46 PM

Wikipedia Signpost

EditMe

Product Release: Page Organizer Redesign

A new release today makes managing the content on your EditMe site even easier. You can now view and manage the entire tree navigation from the editor of any page with simple drag and drop.

27 July, 2010 02:16 AM

26 July, 2010

Samuel Klein

Afghanistan memos

Not papers, but still: Wow. (The Guardian on Wikileaks)

by metasj at 26 July, 2010 10:37 PM

25 July, 2010

David Gerard

Link pile.

  • “Let’s you and him fight! … What do you mean, you’re going to cooperate? Cowards!”
  • Boy, is Citizendium dead. Specifically, the comments: aggrieved academics burnt by CZ versus the North Korean press office. Car crash television. The objective numbers, with graphs and post-mortem.
  • New RationalWiki: Adi Da, arXiv, Biofeedback, Ezekiel’s wheel, Luboš Motl. Also, InstantCommons is way cool.
  • By the way: reward quality investigative journalism and give WikiLeaks some cash. And buy these issues of NYT, der Spiegel and the Guardian and tell them why.
by David Gerard at 25 July, 2010 11:29 PM

    Sue Gardner

    suegardner

    Since joining the Wikimedia Foundation, I’ve hired about 25 people. That means I’ve read thousands of CVs, done hundreds of pre-interview e-mail exchanges and phone calls, and participated in about 150 formal interviews.

    With each hire I’ve –and the Wikimedia Foundation as a whole has– gotten smarter about what kinds of people flourish at Wikimedia, and why. The purpose of this post is to share some of what we’ve learned, particularly for people who may be thinking about applying for open positions with us, or participating in our open hiring call.

    Let me start with this: The Wikimedia Foundation’s not a typical workplace.

    Every CEO believes his or her organization is a special snowflake: it’s essential that we believe it, whether or not it’s true.  And when I first joined Wikimedia, my board of trustees would tell me how unusual we were, and I would nod and smile.  But really.  Once I worked through some initial skepticism, it became obvious that yeah, Wikimedia is utterly unique.

    Viewed through one lens, the Wikimedia Foundation is a scrappy start-up with all the experimentation and chaos that implies. But, it’s also a non-profit, which means we have an obligation to donors to behave responsibly and frugally, and to be accountable and transparent about what we’re doing. We’re a top five, super-famous website, which brings additional scrutiny and responsibility. We work closely with Wikimedia volunteers around the world, many of whom are hyper-intelligent, opinionated, and fiercely protective of what they have created.   And, our role is to make information freely available to everyone around the world — which means we are more radical than, at first glance, we might appear.

    None of those characteristics is, by itself, all that unusual.  (Except the super-smart volunteers. They are pretty rare.)   But our particular combination is unique, which means that the combination of traits that makes someone a perfect employee for us is unique as well.   Here’s what I look for.

    Passion for the Wikimedia mission. This is obvious. We’re facilitating the work of millions of ordinary people from around the world —helping them come together to freely, easily, share what they know.  We’re responsible for the largest repository of information in human history: more than 16 million articles in 270 languages, accessible to people all over the world.   If people aren’t super-excited about that, they have no business working with us.

    Self-sufficiency and independence. The Wikimedia Foundation is not a smoothly-sailing ship: we’re building our ship. That means roles-and-responsibilities aren’t always clear, systems and procedures haven’t been tested and refined over time, and there isn’t going to be somebody standing over people’s shoulders telling them what to do. People who work for the Wikimedia Foundation need to be able to get stuff done without a fixed rulebook or a lot of prodding.

    That’s normal for all young organizations.

    But we’re looking for more than just self-sufficiency.  We have found that a streak of iconoclasm is a really strong predictor of success at Wikimedia.

    Wikipedia is edited by everyone: contributors represent a dizzying array of socio-political values and beliefs and experiences, as well as different ages, religions, sexualities, geographies, and so forth.  In our hiring, we tell people that it isn’t a question of whether working at Wikimedia will push their buttons; it’s just a question of how they will respond once it happens. People who’ve never examined their own assumptions, who embrace received wisdom, who place their trust in credentials and authority: they will not thrive at Wikimedia. And people who are motivated by conventional status indicators: a big office, a big salary, a lot of deference — they won’t either.

    An inventive spirit. People who fit in well at Wikimedia tend to like new ideas, to be curious, and driven towards continual improvement. This manifests in simple, obvious ways – they read widely; they like gadgets and puzzles; they make stuff for fun. They are optimists and tinkerers.

    Openness. At Wikimedia, we look for evidence that applicants have deliberately stretched themselves and sought out new experiences – maybe they’ve lived outside their home country, they read outside their comfort zone, they’ve explored other belief systems.

    Openness means people like to be challenged. They like kicking around ideas, they naturally share and communicate, they’re not defensive or unhealthily competitive. They’re comfortable interacting with a wide range of people, and people are comfortable with them.

    Lastly, we look for orientation towards scalability. The Wikimedia Foundation is a very small group of people.   It achieves impact by working through and with large numbers of volunteers – the millions of people around the world who create 99.9% of the value in the Wikimedia projects.   So in our hiring, we look for people who are oriented towards scale: who reflexively document and share information, who write easily and fluently, who take advantage of channels for mass communication and who instinctively organize and support the work of others.

    If I ran Der Spiegel or Yelp or the ACLU, the traits I’d be looking for would be different. (When I worked at the Canadian Broadcasting Corporation, the people I hired were quite different from the ones I hire today.)  And this list will change over time, as the organization changes. This is the list that works for the Wikimedia Foundation, today.


    Filed under: Hiring, Wikimedia Foundation, Wikipedia, Workplace Culture

    by Sue Gardner at 25 July, 2010 06:38 PM

    Domas Mituzas

    MySQL versions at Wikipedia

    More information about how we handle database stuff can be found in some of my talks.

    Lately I hear people questioning database software choices we made at Wikipedia, and I’d like to point out that…

    Wikipedia database infrastructure needs are remarkably boring.

    We have worked a lot on having the majority of site workload handled by edge HTTP caches, and some of the most database-intensive code (our parsing pipeline) is well absorbed by just 160G of memcached arena, residing on our web servers.

    Also, a major issue with our databases is finding the right balance between storage space (even though text is stored in ‘external store’, which is just a set of machines with lots of large slow disks) – we store information about every revision, every link, every edit – and available I/O performance per dollar for that kind of space needed.

    As a platform of choice we use X4240s (I advertised it before) – 16 SAS disks in a compact 2U package. There’s a relatively small hot area (we have 1:10 RAM/DB ratio), and quite a long tail of various stuff we have to serve.

    The whole database is just six shards, each getting up to 20k read queries a second (a single server can handle that), and a few hundred writes (binlog is under 100k/s – nothing too fancy). We have overprovisioned some hardware for slightly higher availability – we don’t always have on-site resources available – the slightly humorous logic is

    we need four servers, in case one goes down, another will be accidentally brought down by fixing person, then you got one to use as a source of recovery and remaining one to run the site.

    Application doesn’t have too many really expensive queries, and those aren’t the biggest share of our workload. Database by itself is minor part of where application code spends time (looking at profiling now – only 6% of busy application time is inside database, memcached is even less, Lucene is way up with 11%). This is remarkably good shape to be at, and it is much better than what we used to have when we had to deal with insane (“explosive”) growth. I am sure, pretty much anything deployed now (even sqlite!) will work just fine, but what we used has been created during bad times.

    Bad times didn’t mean that everything was absolutely overloaded, it was more that it could get overloaded very soon, if we don’t take appropriate measures, and our fundraisers were much tinier back then. We were using 6-disk RAID-0 boxes to be able to sustain good performance and have required disk space at the same time (or of course, go expensive SAN route).

    While the mainstream MySQL development with its leadership back then was headed towards implementing all sorts of features that didn’t mean anything to our environment (and from various discussions I had with lots of people, many many other web environments):

    • utf8 support that didn’t support Unicode
    • Prepared Statements that don’t really make much sense in PHP/C environments
    • unoptimized subqueries, that allow people to write shitty performing queries
    • later in 5.0 – views, stored routines, triggers
    • etc…

    … nobody was really looking at MySQL performance at that time, and it could have insane performance regressions (“nobody runs these things anyway”, like ‘SHOW STATUS’) and a forest full of low hanging fruits.
    From operations perspective it wasn’t perfect either – replication didn’t survive crashes, crash recovery was taking forever, etc.

    That’s when Google released their set of patches for 4.0, which immediately provided an incredible amount of fixes (that’s what I wrote about it back then). To highlight some of the introduced changes:

    • Crash-safe replication (replication position is stored inside InnoDB along with transaction state) – this allowed to run slaves with innodb log flushing turned off on slaves and having consistent recovery, vanilla MySQL doesn’t have that yet, Percona added this to XtraDB at some point in time
    • Guaranteed InnoDB concurrency slot for replication thread – however loaded the server is, replication does not get queued outside and can proceed. This allowed us to have way more load pointed towards MySQL. This is now part of 5.1
    • Multiple read-ahead and write-behind threads – again, allowed to bypass certain bottlenecks, such as read-ahead slots (though apparently it is wiser just to turn off read-ahead entirely) – now part of InnoDB Plugin
    • Multiple reserved SUPER connections – during overloads systems were way more manageable
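
    The idea behind the first item, keeping the replication position in the same transactional store as the data, can be sketched in a few lines. This is only an illustration and not the Google patch itself: it uses Python's built-in sqlite3 module instead of InnoDB, and the table names, the apply_event helper and the simulated crash are all made up for the example.

```python
import sqlite3

def apply_event(conn, position, sql, params=(), crash_midway=False):
    """Apply one replicated change and record its position in ONE transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(sql, params)
        if crash_midway:
            # Hypothetical failure between the data write and the position update.
            raise RuntimeError("simulated crash")
        conn.execute("UPDATE replication_state SET position = ?", (position,))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
conn.execute("CREATE TABLE replication_state (position INTEGER NOT NULL)")
conn.execute("INSERT INTO replication_state VALUES (0)")
conn.commit()

insert_page = "INSERT INTO page (id, title) VALUES (?, ?)"
apply_event(conn, 1, insert_page, (1, "Main Page"))

# Event 2 "crashes" halfway: the data write is rolled back with the position.
try:
    apply_event(conn, 2, insert_page, (2, "Sandbox"), crash_midway=True)
except RuntimeError:
    pass

position_after_crash = conn.execute("SELECT position FROM replication_state").fetchone()[0]
pages_after_crash = conn.execute("SELECT COUNT(*) FROM page").fetchone()[0]

# Recovery simply resumes from the stored position and re-applies event 2.
apply_event(conn, position_after_crash + 1, insert_page, (2, "Sandbox"))
final_position = conn.execute("SELECT position FROM replication_state").fetchone()[0]
final_pages = conn.execute("SELECT COUNT(*) FROM page").fetchone()[0]
```

    Because the position and the data commit or roll back together, the replica can never believe it has applied an event that is missing from its tables, or the other way around.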

    Running these changes live has been especially successful (and that was way before Mark/Google released their 5.0 patch set which was then taken in parts by OurDelta/Percona/etc) – and I spent quite some time trying to evangelize these changes to MySQL developers (as I would have loved to see that deployed at our customers, way less work then!). Unfortunately, nobody cared, so running reliable and fast replication environments with mainline MySQL didn’t happen (now one has to use either XtraDB or FB build).

    So, I did some merging work, added a few other small fixes and ended up with our 4.0.40 build (also known as four-oh-forever), which still runs half of the shards today. It has sufficient in-memory performance for us, it can utilize our disk capacity fully, and it doesn’t have crash history (I used to tell about two 4.0 servers, both whitebox raid0 machines, having unbroken replication session for two years). By today’s standards it already misses a few things (I miss fast crash recovery mostly, after last full power outage in a datacenter ;-) – and developers would love to abuse SQL features (hehe, recently a read-only subquery locked all rows because of a bug :-) I’m way more conservative when it comes to using certain features live, as when working at MySQL Support I could see all the ways those features break for people, and we used to joke (this one was about RBR :):

    Which is the stable build for feature X? Next one!

    Anyway, even knowing that stuff breaks in one way or another, I was running a 5.1 upgrade project, mostly because of peer pressure (“4.0 haha!”, even though that 4.0 is more modern from operations standpoint).

    As MediaWiki is open-source project, used by many, we already engineer it for wide range of databases – we support MySQL 4.0, we support MySQL 6.0-whatever-is-in-future, and there’s some support for different vendor DBMSes (at various stages – PG, Oracle, MS SQL, etc) – so we can be sure that it works relatively fine on newer versions.

    Upgrade in short:

    • Dump schema/data
    • Load schema on 5.1 instance
    • Adjust schema, as we can do it, set all varchar to varbinary to maintain 4.0 behavior
    • Load data on 5.1 instance
    • Fix MySQL to replicate from 4.0 (stupid breakage for nothing)
    • Switch master to 5.1 instance
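
    The "adjust schema" step above could look roughly like the sketch below. It is not the script that was actually used; it is a naive, line-based rewrite of a schema dump, the varchar_to_varbinary helper and the example table are hypothetical, and a real dump would need more care (comments, default values containing the word "varchar", other character types).

```python
import re

# Matches "varchar(N)" with an optional 4.0-style "binary" attribute after it.
VARCHAR_RE = re.compile(r"\bvarchar\s*\((\d+)\)(?:\s+binary\b)?", re.IGNORECASE)

def varchar_to_varbinary(schema_sql):
    """Rewrite every varchar(N) column type in a schema dump to varbinary(N)."""
    return VARCHAR_RE.sub(r"varbinary(\1)", schema_sql)

# Hypothetical table definition, only for demonstration.
example = """CREATE TABLE example_page (
  page_id int(8) unsigned NOT NULL auto_increment,
  page_title varchar(255) binary NOT NULL default '',
  page_restrictions varchar(64) NOT NULL default '',
  PRIMARY KEY (page_id)
);"""

converted = varchar_to_varbinary(example)
print(converted)
```

    Storing the columns as varbinary means 5.1 compares and sorts them byte by byte, without applying any character set or collation, which is the 4.0 behavior the step is meant to preserve.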

    We had some 5.0 and 5.1 replicas running for a while to detect any issues, and as there weren’t too many, the switch could be nearly immediate (English Wikipedia was converted 4.0->5.1 over a weekend).

    I had an engineering effort before to merge Google 5.0 patches into later than 5.0.37 tree, but eventually Mark left Google for Facebook and “Google patch” was abandoned, long live the Facebook patch! :)

    At first FB-internal efforts were to get the 5.0 environment working properly, so 5.1 development was a bit on hold. At that time I cherry-picked some of Percona’s patch work (mostly to get transactional replication for 5.1, as well as fast crash recovery) – and started deploying this new branch. Of course, once Facebook development focus switched to 5.1, maintaining separate branch is becoming less needed – my plan for the future is getting FB build deployed across all shards.

    The beauty of FB-build is that development team is remarkably close to operations (and operations team is close to development), and there is lots of focus on making it do the right stuff (make sure you follow mysql@facebook page). The visibility of systems (PMP!) we have at Facebook can be transformed into code fixes nearly instantly, especially when compared with development cycles outside. I’m sure some of those changes will trickle to other trees eventually, but we have those changes in FB builds already here, and they are state of the art of MySQL performance/operations engineering, while maintaining great code quality.

    Yes, at Wikipedia we run a mix of really fresh and also quite old/frozen software (there will be unification, of course), but…. it runs fine. It isn’t as fascinating anymore as years ago, but it allows not paying any attention for years. Which is good, right? Oh, and of course, there’s full data on-disk compatibility with standard InnoDB builds, in case anyone really wants to roll back or switch to the next-best-fork.

    by Domas Mituzas at 25 July, 2010 06:36 PM

    Samuel Klein

    24 July, 2010

    Titoxd

    Sometimes interesting stuff happens


    IMG_9182, originally uploaded by Titoxd.

    I've worked on many current-event articles in Wikipedia; however, I cannot remember when a current event happened close enough to me to actually be able to observe it unfold in real time. That happened on Tuesday when the Town Lake west dam collapsed; while I was not on campus at the time, I did track it as soon as news broke on wiki (although to be honest, I reverted the initial reports since nobody in the traditional media had reported it online at the time).

    I went to the lake to take pictures of the rupture's aftermath, and I've posted all of them on Flickr. I also posted one in the Wikimedia Commons, and to my delight it has been used by other people in the online community. :) So it feels rather different to be "reporting" on news than to be searching for photographs or articles in the news...

    ~~~~

    by Titoxd (noreply@blogger.com) at 24 July, 2010 04:24 AM

    Samuel Klein

    Citizendium: failure to thrive, in search of peace

    After early months of interest and glory — peaking in a spike in mailing list traffic that was moderated for being too active — Citizendium’s growth levelled off and has declined steadily since 2008. Now it is looking for a long-term home.

    I have mixed feelings about Citizendium.  I was excited about it in 2006 — at first blush, it offers a serious alternative for expert editors who want to contribute to free knowledge but feel unappreciated or unwelcome at Wikipedia.  And in general, compatibly-licensed alternatives to Wikipedia are a very good thing – the whole point of using free licenses is to encourage reuse.   But to succeed on the scale of its original dreams, Citizendium must overcome its insularity and make good on its core promise of quality.  Not unlike Wikipedia, it is currently known as much for its humorous highlights as for its best work.  And it faces the same problems with difficult and misguided editors — some who have quite solid credentials — only with a much smaller community to handle that workload.

    I still hope for a proliferation of cousin projects, all competing to find the best way to spur collaboration around free knowledge.  There is so much to explore in the way of how to create welcoming communities for different audiences of writers and creators.  Community atmosphere, and a limitation in the types of knowledge that can be easily shared, are among Wikipedia’s major bottlenecks.   It is welcoming to a narrow[ing?] audience, and if this does not change it may face its own dramatic slowdown in participation – more joyful models are welcome.  (My recent favorite, in style, tools, and atmosphere: fotopedia.)

    The questions that inspire Citizendium remain:  How can we expand collaborative production of educational works to topics that require rare expertise in a field?  How can we verify new works as quickly as they are produced, and how much does this speed depend on the commonality of the knowledge involved?  

    Verification processes are time-consuming, as the slow but steady output of Nupedia, CZ, and even Veropedia show. Since 2006, CZ has produced roughly 150 verified articles (almost triple Nupedia’s output and, well, 150 more than Wikipedia’s own). The featured article and peer review processes on various language Wikipedias are likewise notoriously slow.

    CZ and others try to accomplish this by raising the bar for personal credentials of contributors, and increasing the personal responsibility of a group of meta-editors for the quality of work in a topic.   Some common dilemmas:

    Verifying expertise is difficult without it.
    Experts face demand on their time from many projects.
    Past expertise is no guarantee of future quality of work.
    Professional reputation can be tied to a particular theory.

    And a dilemma special to Wikipedia’s commitment to NPOV: experts often have strong opinions about which theories are right and wrong in their fields.   How can they contribute in peace to a discussion whose end result will not take a position on which is right?

    Two thoughts:

    • One barrier to participation is the qualification expected of reviewers. We could learn much from how Law Reviews are published, I expect, since the field of Law is unique in depending on its students, still pursuing their degrees, to oversee and produce the most distinguished reviewed publications in the field.
    • Another is the inflexibility of a “yes/no” review system.  Less permanent and reversible ways to validate information can be based on guidelines for fine-grained citation and annotation, and a visible place for review and analysis of a text linked prominently from it.  Moreover a review process that formally places works on a spectrum of completion and verification can offer more useful and detailed information than a stamp of approval.

    These ideas draw on work by the current assessment process used widely on the French and English Wikipedias.  I would be interested to hear thoughts from people familiar with law reviews and other large-scale review processes, or with the CZ verification process or that of other educational wikis.

    by metasj at 24 July, 2010 12:59 AM

    Working Wikily

    The “Green Revolution”: a case of being blinded by the new

    “New” is a powerful force for getting people excited. New technologies often create new winners and losers, and in the rush to be a winner, or at least find out who the winners will be, many of us end up getting a little ahead of ourselves. This pattern has been on full display in the past year as the complete story of Iran’s “Green Revolution” has emerged. Many of the most respected media outlets in the U.S. reported on the protests in Iran as being powerfully accelerated by the use of Twitter, and the State Department famously asked Twitter to postpone its scheduled maintenance out of concern that the downtime could hamper the protesters’ coordination. But now it turns out that it just wasn’t so: as documented in Foreign Policy and elsewhere, the tweeting was happening outside of Iran, serving only to spread the news from Iran to the world and hardly—if at all—used by the dissidents themselves. There were those who raised questions at the time. For example, we quoted Gaurav Mishra in an interview with BusinessWeek arguing that Iranians’ use of the tools remains “somewhat limited” but that the story was getting attention because “the international media loves [the] social-networking world.”

    I think this is an important cautionary tale for anyone leading an organization who is trying to answer the often-asked question: “What’s your social media strategy?” The dangerous answer is, as we’ve mentioned here before, “We’re going to get on Twitter, start a Facebook group, and launch a blog.” Those are answers that focus on the tool, not the outcome. We see the same issue when we talk to people about innovation, which is a process that is often mistaken for a goal. New technology is just like a new process: a means to an end that should always be used with care to serve your underlying organizational needs.

    Why is it so common to confuse the two? One reason is the fear of “grasping the nettle,” hoping that the new process or technology will mean that your problem will simply vanish. Another reason, often ignored, is that we in the West tend to have a kind of lightweight messianism in how we see technology. To my eye, part of the reason why the story of the “Twitter revolution” was so easily believed was because many see new technology as simply making the world better. Technology represents progress, an idea that has a special place in American culture, and that feeds into a common attitude that when there’s new technology available that simply using it should solve the problem at hand. But that’s rarely the case and especially rare when the technology is designed to improve the way we interact. The opportunities we see in those technologies for improved relationships, and the ways we grasp those opportunities, have far more impact than the tools’ mere presence. When we fall in love with the newness of a technology, as the media did with Twitter, we can easily be blinded by that unconscious belief that it will simply whisk us off into a brighter world. But of course, new technologies merely open the door. It is up to us to step through.

    by Noah Flower at 24 July, 2010 12:03 AM

    23 July, 2010

    AboutUs

    The Social Web Can Help You Grow

    Sometimes it seems like every business on the planet has a Facebook page or a Twitter feed — or both. But it’s one thing to set these up, and another to use these popular social networking tools effectively.

    Pick up some tips on using these popular social sites from AboutUs community manager Kristina Weis. She shares ideas for what news you can share, and advice for building real relationships with your customers on both Facebook and Twitter.

    AboutUs uses both Facebook and Twitter to communicate what we’re doing, and to listen to what our community members have to say. Our experience has been very positive, and yours can be, too.

    by Aliza Earnshaw at 23 July, 2010 07:15 PM

    Guillaume Paumier

    Wikimedia Multimedia UX testing videos

    Over the past few months, I’ve been coordinating the preparation of a formal User experience (UX) study for the Multimedia usability project. Basically, it means observing how “real” users interact with the Wikimedia Commons in order to improve it. Videos of the testing have now been published in order to share them with the community.

    Observers behind a semi-transparent glass, looking at a user on a computer, guided by a facilitator

    The observation room at the testing facility; the testing is happening in the background, behind the semi-transparent glass (CC-by-nc by Neil Kandalgaonkar).

    Getting there

    We reached out to some UX firms and published a Call for proposals in February. Several firms submitted proposals; after serious consideration, we chose to work with gotomedia, a San Francisco-based firm that seemed to align best with our goals & values.

    The study was planned to take place in March, but was postponed because the prototype was not ready. In the meantime, we asked some of our co-workers to test it in order to uncover the most obvious flaws & bugs.

    Goals & testing conditions

    A few weeks ago, the actual testing eventually took place. We tested ten users: five locally in San Francisco, and five remotely within the US. We considered conducting similar testing abroad, in order to identify language-specific issues; but in the end, it turned out that we wouldn’t learn a lot by simply replicating the same test script.

    Multilingualism on Commons (and Wikimedia websites generally) is a huge piece of work that deserves dedicated efforts, and dedicated UX studies. The main reason for which we decided to hold the testing halfway through the project, and not at its very beginning, was that we could test both the current upload interface, and our prototype.

    On the one hand, during our preliminary research phase, we identified a large number of issues with the current interface; but we still needed to formally record the user experience and validate our preliminary conclusions. On the other hand, we wanted to do a reality-check with our prototype, to see if the direction we had chosen was appropriate, and to identify areas of improvement.

    Highlight videos

    The testing sessions went pretty smoothly. The gotomedia folks did a fantastic job at preparing the “highlight videos” in time for our conferences in Gdańsk (WikiSym & Wikimania). The audiences really liked them, although we didn’t have time to show all of them.

    Highlight videos are edited summaries of the main findings of the study. In our case, we have three highlight videos: one about the testing of the current interface on Commons, one about the testing of the prototype, and the last one about how we could improve the prototype.

    Long story short: the current interface is a nightmare, and the prototype is way better, even if there are some minor things to improve. The good news is, all the items to improve were already planned features at the time of testing, and they have either already been added, or will be before the upload wizard is released.

    Namely, one of the main remaining issues is the fact that users don’t really understand copyright and free licenses. That’s why we’ve been working on a licensing tutorial at the same time, to be released jointly with the new upload wizard.

    See for yourself

    The highlight videos are now available on Wikimedia Commons; per our agreement with gotomedia, all the videos were released under the Creative Commons Attribution – Share alike 3.0 license.

    In the tradition of Wikipedia’s Neutral point of view policy, we’ll try to upload the unedited videos to Commons as well, in order to let the community draw their own conclusions.

    If you would like to draw our attention to things we’ve missed, or even edit your own highlight videos yourself, you are warmly invited to do so. You can watch the highlight videos below (if it works) or on Commons. The links to Commons are available below if you want to download the video files on your computer.

    Your feedback and comments are much welcome.

    Current interface highlight video

    Prototype highlight video

    Room for improvement highlight video

    Files

    by Guillaume Paumier at 23 July, 2010 09:29 AM

    22 July, 2010

    David Gerard

    Nazi Goatse, part 94.

    Wikimedia has set up an investigation into the question of contentious content on the projects. Sexual content, violent content, pictures of Muhammad. The stuff that’s legal, but whose very existence offends people.

    My sympathy goes out to the poor sods charged with the study. I’d be hard put to think of a more poisoned chalice. No matter what they come up with, they will be called Nazis and worse. And whatever they come up with will change no minds whatsoever and be hideously distorted — if they said “the best thing for Wikimedia is a goatse at the top of all pages,” someone would say “yes, and this is why anyone advocating images purporting to be Muhammad should be beheaded.”

    The meta talk page has already been swooped upon by the usual participants and reduced to somewhat worse than uselessness.

    I can reiterate my basic argument, as father of a three-year-old and stepfather of two teenagers.

    The Wikimedia communities are sufficiently painstaking in making sure everything is educational and in context that I’d happily let my daughter in front of Wikimedia unrestricted. Anything sexual or horrifying would be informative and in context.

    The community works incredibly hard to make the contentious stuff good. Any kid who looks up “fuck” on English Wikipedia will come away considerably educated, for example!

    The last shock I got from Wikipedia was when I followed a link on another site to Cock ring, and was confronted with a large, shiny, erect penis. With, of course, a cock ring on it. Not something I’d care to have pop up on the screen at work … on the other hand, I have no reason to be going to an article on cock rings at work. I think the article was entirely reasonable and the use of the picture was entirely reasonable.

    Then there is the issue of important photos of war and so on that are absolutely horrifying. They should be in the encyclopedia, even if merely describing some of them makes my stomach do flip-flops.

    I think experience shows that the Wikimedia communities take their responsibility to educate seriously enough that “Wikipedia is not censored” is sufficient in practice. I have seen no cases that would lead me to think otherwise.

    As noted in the most recent foundation-l reiteration of the Muhammad image discussion, Wikimedia has a firm bias to more information rather than less. It’s right there in the mission statement. Increasing, not decreasing, knowledge is why the community is here at all. If you go against the statement and expectation that more information is better than less information — even if the information is horrible and shocking — the community will not accept it. If the Foundation forces filtering on the community, the community will get up and leave. As Milos Rancic noted, implementing any of the recommendations on that meta talk page will promptly lead to a fork. As it should — insulting your community in such a manner is an excellent way to get rid of them.

    Filtering should be left to third parties. The SOS Children Wikipedia for Schools is an excellent example, and it’s quite popular and won’t get a teacher fired. Other than that, I’ve seen no evidence of actual demand for a filtered Wikimedia from end users — only from people who want to filter the projects themselves at the source.

    One perennial proposal is for images in given categories to be hidden from view for logged-in users. This is an idea I like, as it puts control in the hands of the viewer rather than third parties. All it requires is someone to code something that passes muster with Tim and Domas as unlikely to melt the servers.

    by David Gerard at 22 July, 2010 07:05 PM

    Jeroen De Dauw

    MediaWiki Deployment: Modifying the new installer

    I got a new diagram!!!1!11!! It’s based on my previous one, but slightly more elaborate, and a lot less messy, as I now used Dia to create it :)

    MediaWiki deployment diagram

    Legend:

    • Striped lines: Existing components where code will be copied from, or based upon.
    • Full lines: Components of the complete deployment model.
    • Thick full lines: Core components (of the deployment model) that I definitely want to have completed during GSoC.

    Since my previous post about my Google Summer of Code project I have been poking at the new MediaWiki installer to see what’s there already, how it is there, and how I can integrate it with the above deployment model. I’ve made a bunch of style and documentation improvements while going over the code, and renamed some things to make more sense. And I had Tim Starling clean up a bad svn commit I made :P

    So what I’m doing now is splitting the current ‘Installer’ class, which is part of the new installer, into 2: Installer and CoreInstaller. Installer will hold general installer functionality and be part of the whole deployment model, while CoreInstaller will hold installer functionality specific to core, and will be part of the new installer. After that I can create an initial version of CoreInstaller’s counterpart: ExtensionInstaller.
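
    To make the split more concrete, here is a rough sketch of the shape it could take. MediaWiki itself is written in PHP; this is Python purely for illustration, and apart from the three class names every method, step name and the "ExampleExtension" argument is invented for the example rather than taken from the real installer code.

```python
from abc import ABC, abstractmethod

class Installer(ABC):
    """General installer functionality, shared by the whole deployment model."""

    def __init__(self):
        self.log = []

    def run(self):
        # Same overall sequence for every kind of installer;
        # only the steps in the middle differ per subclass.
        self.log.append("check environment")
        self.log.extend(self.steps())
        self.log.append("done")
        return self.log

    @abstractmethod
    def steps(self):
        """Return the installer-specific step names, in order."""

class CoreInstaller(Installer):
    """Installer functionality specific to core."""

    def steps(self):
        return ["create database tables", "write settings file"]

class ExtensionInstaller(Installer):
    """Counterpart of CoreInstaller for a single extension."""

    def __init__(self, extension_name):
        super().__init__()
        self.extension_name = extension_name

    def steps(self):
        return [f"download {self.extension_name}", f"enable {self.extension_name}"]

core_log = CoreInstaller().run()
extension_log = ExtensionInstaller("ExampleExtension").run()
```

    The point of the base class is that anything both installers need lives in one place, so ExtensionInstaller only has to describe what is different about installing an extension.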

    by Jeroen De Dauw at 22 July, 2010 05:09 PM

    AboutUs

    Followed Links Return to AboutUs.org

    You might remember our May announcement that we would cease following external links on AboutUs.org, pending a review of our policies and practices. We’re happy to say that we’ve completed the review, and we’re once again selectively following external links on our site.

    As before, the default for our site is not to follow external links until an editorial review by our staff determines that they offer real value to AboutUs.org visitors. We’ve resumed following links on all pages created by community members that we’d previously reviewed and that had followed links.

    As we said in the original blog post, our decision to no-follow stemmed from the confusion we’d caused by following external links in content we’d been paid to create. We had given the impression of selling links, an activity that’s frowned on by Google, and one that we ourselves don’t endorse.

    We still won’t be following links in any content we are paid to create.

    As always, we welcome edits by community members who want to share their knowledge of websites. If you edit the AboutUs domain page for a website you think is great, you may request an editorial review by emailing DoFollow@AboutUs.org. If you’ve added good information to the AboutUs page, one of our staff will review the website you’ve written about. We’ll follow the links if we agree the site is valuable for AboutUs visitors.

    Our goal is to create a place for people who want to find websites on the topics that interest them, and for website owners and operators who want to promote their sites. We made the decision to resume following external links so we could serve you better.

    —-
    Note: Two days ago, community manager Kristina Weis emailed a handful of people who had contacted us personally about the change to no-follow, letting them know that we have returned to following external links after an editorial review. Carter Cole blogged about the change. We’ve heard from a few more people that they’re happy we’re following links again.

    If you have any questions or concerns about link following on AboutUs.org, email Help@AboutUs.org.

    by Aliza Earnshaw at 22 July, 2010 05:00 PM

    EditMe

    Spotlight On: Learn it in 5

    With 75 percent of Millennials, or people ages 18-29, using social media, Learn It In 5's powerful library of how-to videos, produced by technology teachers, helps teachers and students create classroom strategies for today's 21st-century digital classroom. Mark Barnes tells us why he's selected EditMe as the platform for this new community site for teachers in this latest installment of Spotlight On: Learn It In 5.

    22 July, 2010 01:04 PM

    Guillaume Paumier

    Wikimedia at KDE Akademy 2010

    Three weeks ago, I attended the KDE Akademy 2010 conference in Tampere, Finland. My colleague Parul also came along. We gave a talk entitled Wikimedia User Experience programs: lowering the barriers of entry. Basically, we presented the work done as part of the Wikipedia usability initiative, and the Multimedia usability project.

    The KDE logo, followed by I went to Akademy 2010

    and it was fun.

    Shared values & challenges

    It might seem odd for Wikimedia to be presenting at KDE Akademy: Wikimedia is mostly about online content, while KDE is mostly about desktop software. Yet, they share common goals & values.

    On the one hand, a common criticism made against KDE is its feature creep: the tendency to allow for maximum customizability in KDE often comes at the price of simplicity and ease of use.

    On the other hand, MediaWiki, the software on which Wikipedia and the other Wikimedia websites rely, suffers from the same flaws: it has always been “designed” by developers. As a consequence, the interface reflects the implementation model, and often doesn’t match, or even conflicts with, the user’s mental model. The Wikimedia Foundation recently started to include user research and design as part of their development cycle, where user experience is taking an increasingly critical role.

    Our presentation at Akademy was an opportunity to share experience. Both KDE and Wikimedia communities struggle to improve complex interfaces, and both communities have a lot to learn from each other.

    Wikimedia and KDE also have more practical ties: Wikimedia Deutschland e.V. and KDE e.V. used to share an office a few years ago. I’ll take this opportunity to thank Claudia Rauch for inviting us to submit a proposal for Akademy this year.

    Presentation slides & video

    Thanks to KDE e.V. and their awesome volunteers, the full video of our talk (and the follow-up discussion) is available, along with all the other videos, from the Akademy schedule page. A slightly edited version is also available from Wikimedia Commons; you can also download the file to your computer (OGV, 162 MB). Or, you can watch it below, if it works.

    The presentation slides aren’t very useful alone, but they’re also available on Commons if you want to take a look or watch them alongside the video (download: PDF, 2.2 MB).

    Meeting the KDE community

    I’ve had some interaction with the KDE community before. I used to live in the same city as one of the lead KDE developers, and we belonged to the same LUG. I’m also familiar with the digiKam community, with whom I’ve been working on and off.

    Besides our presentation, Akademy was also an opportunity to get together with the “gearheads”, to discuss collaboration opportunities, and of course to get my debugging duck.

    Close-up of a yellow rubber duck, sitting on a black notebook

    « Take the duck from your desk, look at your code and explain to the duck - line by line - what it does. »

    Working hand in hand

    We had planned to hold a more hands-on workshop to discuss practical common projects between the KDE & Wikimedia communities. Unfortunately, I had to leave Tampere early to fly to Gdańsk for WikiSym & Wikimania. I didn’t have much time to explore the city either, which is a pity; Tampere is a quaint little city, and the surroundings looked really charming.

    I would still like to work on common projects, as I think there’s a huge potential for a better integration of Wikimedia websites with the desktop. Since I’ve been thinking about this for a while, I have a few ideas of my own: mass upload tool, offline wiki editor, desktop widgets (e.g. for Wiktionary, FAOTD, POTD), application plugins (e.g. to find media files from Commons from within an application), instant messaging with other Wikimedia editors, etc. That said, I would also like to collect ideas & feedback.

    So, what Wikimedia content would you like to access from your desktop? For what use? What desktop tool would facilitate your editing or reading of Wikimedia projects?

    by Guillaume Paumier at 22 July, 2010 09:28 AM

    OmegaWiki

    So we like #Commons

    Yesterday Kipcool surprised us with a more visible link to Wikipedia; today he added Commons to it as well. Commons is essentially different in that there is only one link to Commons per concept.

    When you look for pictures of a horse at Google or Bing, it makes a big difference which language you search in. If you look, for instance, for "សេះ", you will find far fewer horses.


    This is what it looks like in Arabic. As there is an annotation referring to the Arabic Wikipedia article, the Arabic article is selected.
    Thanks,
          GerardM

    by noreply@blogger.com (GerardM) at 22 July, 2010 12:15 AM

    21 July, 2010

    EditMe

    Ways to Wiki: Create Happy Clients

    Email is not an effective medium to run an entire business or consulting project. Here are three reasons why using collaboration sites as client portals can really supercharge your consulting efforts and create happy clients.

    21 July, 2010 04:47 PM

    20 July, 2010

    OmegaWiki

    So we like #Wikipedia ...

    Every now and then, I am happily surprised by new functionality for OmegaWiki. This time Kipcool made our existing Wikipedia links visible. When there are references to a Wikipedia article, he will point you to the article in your language.


    The expression dólar shows the Dutch Wikipedia article for me. It will show the Spanish article for Ascander and the English article for Kipcool (the French and German articles are not linked yet).

    The link is added to the page with JavaScript. This prevents additional load on our server. I hope you like it; it is another excellent reason to dig into our annotations.
    Thanks,
          GerardM

    by noreply@blogger.com (GerardM) at 20 July, 2010 09:46 PM

    AboutUs

    Win New Business with Email

    If you own a business, it’s smart to send emails to customers and prospects who have opted in to receiving them. Sharing product updates and company news can bring people back to your website. Do it right, and they might even mention you to their friends.

    AboutUs community manager Kristina Weis sends an email newsletter to our community members whenever we have something interesting to share. Her new article, Tips for Sending Email Newsletters, offers her hard-won wisdom about how to get the best results from your email marketing efforts.

    In fact, Kristina is sending out the latest AboutUs newsletter this week. If you’re already on the list, you’ll receive it soon. If you aren’t, you can easily sign up for all our news and tips.

    Have you learned something important about promoting your business on the web? Would you like to author an article on AboutUs.org? If so, contact Aliza@AboutUs.org.

    by Aliza Earnshaw at 20 July, 2010 07:11 PM

    User:Bawolff

    image metadata

    I thought I'd write a blog post about my Google Summer of Code project. I've never been much of a blogger, but I see lots of my fellow GSoC'ers blogging, so I thought I'd join in. My project is to improve MediaWiki's support for image metadata. Currently MediaWiki extracts metadata from an image and puts a little table at the bottom of the image page detailing all of it (for example, see http://commons.wikimedia.org/wiki/File:%C3%89cole_militaire_2545x809.jpg#metadata ).

    However, this is far from all the metadata embedded in an image. In fact, MediaWiki currently only extracts Exif metadata. Exif is arguably the most popular form of metadata, so if you're going to extract only one, Exif is a good choice. Every time you take a picture with your digital camera, it adds Exif data to your picture. Most of this data is technical: FNumber, shutter speed, camera model, etc. You can also encode things like artist, copyright, and image description in Exif, but that is much rarer.

    What I'm doing is first of all fixing up the Exif support a little bit. Currently some of the Exif tags are not supported (Bug 13172). Most of these are fairly obscure tags no one really cares about, but there are some exceptions like GPSLatitude, GPSLongitude, and UserComment.

    I'm also (among other things) adding support for IPTC-IIM tags. IPTC-IIM is a very old format for transmitting news stories between news agencies. Adobe adopted parts of this format for embedding metadata in JPEG files with Photoshop. Nowadays it's slowly being replaced by XMP, but many photos still use it. IPTC metadata tends to be descriptive (title, author, etc.), whereas Exif metadata tends to be technical (aperture, shutter speed).

    My code will also try to sort out conflicts. Sometimes there are conflicting values in the different metadata formats. If an image has two different descriptions in the Exif and IPTC data, which should be displayed? Exif, IPTC, or both? Luckily for me, several companies involved in images got together and thought long and hard about that issue. They then produced a standard for how to act if there is a conflict [1]. For example, if the IPTC and Exif data conflict on the image description, then the Exif data wins.
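
    The conflict rule can be illustrated with a small sketch. It is written in Python rather than MediaWiki's PHP, and it is not the code from the img_metadata branch. Only "Exif beats IPTC for the image description" comes from the post; applying the same priority to every field, the function names, and the sample values are assumptions made for illustration.

```python
from typing import Dict, Optional

# Hypothetical priority order: earlier formats win. Only the description
# rule (Exif over IPTC) comes from the post; using one order for every
# field is a simplifying assumption.
PRIORITY = ("exif", "iptc")


def resolve_field(field: str, sources: Dict[str, Dict[str, str]]) -> Optional[str]:
    """Return `field` from the highest-priority format that has a value for it."""
    for fmt in PRIORITY:
        value = sources.get(fmt, {}).get(field)
        if value:  # skip missing or empty values
            return value
    return None


def merge_metadata(sources: Dict[str, Dict[str, str]]) -> Dict[str, Optional[str]]:
    """Merge every field found in any format into one flat dict."""
    fields = {name for data in sources.values() for name in data}
    return {name: resolve_field(name, sources) for name in sorted(fields)}


if __name__ == "__main__":
    # Made-up sample values, not taken from a real file.
    sample = {
        "exif": {"description": "Description stored in Exif", "model": "EX-Z55"},
        "iptc": {"description": "Description stored in IPTC", "author": "Example Author"},
    }
    print(merge_metadata(sample))
```

    Fields that exist in only one format pass through unchanged, so the merge only has to decide anything when two formats disagree.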



    Consider [[File:2005-09-17 10-01 Provence 641 St Rémy-de-Provence - Glanum.jpg]]

    On Commons the metadata table looks like:



    But on my test wiki the table looks like:

    Camera manufacturer: CASIO COMPUTER CO.,LTD
    Camera model: EX-Z55
    Exposure time: 1/800 sec (0.00125)
    F Number: f/4.3
    Date and time of data generation: 14:21, 28 September 2005
    Lens focal length: 5.8 mm
    Latitude: 43° 46′ 21.35″ N
    Longitude: 4° 50′ 1.34″ E
    Orientation: Normal
    Horizontal resolution: 72 dpi
    Vertical resolution: 72 dpi
    Software used: Microsoft Pro Photo Tools
    File change date and time: 14:21, 28 September 2005
    Y and C positioning: Centered
    Exposure Program: Normal program
    Exif version: 2.21
    Date and time of digitizing: 14:21, 28 September 2005
    Meaning of each component: 1. Y, 2. Cb, 3. Cr, 4. does not exist
    Image compression mode: 3.66666666667
    Exposure bias: 0
    Maximum land aperture: 2.8
    Metering mode: Pattern
    Light source: Unknown
    Flash: Flash did not fire, compulsory flash suppression
    Supported Flashpix version: 0,100
    Color space: sRGB
    File source: DSC
    Custom image processing: Normal process
    Exposure mode: Auto exposure
    White balance: Auto white balance
    Focal length in 35 mm film: 35
    Scene capture type: Standard
    Scene control: None
    Contrast: Normal
    Saturation: Normal
    Sharpness: Normal


    Most notably, GPS information is now supported. As a note, the Wikipedia links for camera model are a Commons customization, which is why they don't appear on my test output.
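
    For readers wondering how a GPS value turns into a line like the latitude above: Exif stores GPSLatitude and GPSLongitude as three rational numbers (degrees, minutes, seconds) plus a separate hemisphere reference. The sketch below is in Python rather than MediaWiki's PHP and is not the branch's code; the function names are hypothetical, and the output format simply mimics the table.

```python
from fractions import Fraction
from typing import Sequence


def gps_to_decimal(dms: Sequence[Fraction], ref: str) -> float:
    """Convert (degrees, minutes, seconds) plus a hemisphere letter to signed decimal degrees."""
    degrees, minutes, seconds = dms
    value = degrees + minutes / 60 + seconds / 3600
    if ref in ("S", "W"):  # south and west are negative by convention
        value = -value
    return float(value)


def format_dms(dms: Sequence[Fraction], ref: str) -> str:
    """Render the three rationals the way the metadata table shows them."""
    degrees, minutes, seconds = dms
    return f"{int(degrees)}° {int(minutes)}′ {float(seconds):.2f}″ {ref}"


if __name__ == "__main__":
    # Latitude from the table above, written as rationals.
    latitude = (Fraction(43), Fraction(46), Fraction(2135, 100))
    print(format_dms(latitude, "N"))
    print(gps_to_decimal(latitude, "N"))
```

    Keeping the values as fractions until the last step avoids accumulating floating-point error while the three components are combined.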

    As another example, consider [[file:Pöstlingbahn TFXV.jpg]]. On Commons, it has no metadata extracted. (It does have some information about the image on the page, but this was all entered by hand.) On my test wiki, the following metadata table is generated:



    I'm almost done with IIM metadata, and plan to start working on XMP metadata soon. If you're curious, all the code is currently in the img_metadata branch. You can also look at the status page, which I will try to update occasionally.

    Cheers,
    Bawolff

    by Bawolff (noreply@blogger.com) at 20 July, 2010 07:38 PM

    Jeroen De Dauw

    MediaWiki.org user page 1 year

    Today my MediaWiki.org user page is one year old – I created the first version on July 20, 2009. With my SVN account also approaching its first birthday, I can now say I’ve been doing MediaWiki development for a year. A lot has happened in this year.

    I created the Maps and Semantic Maps extensions, and have continued releasing big and small updates the whole year long. At the end of 2009 I created the Validator extension to facilitate parameter handling in Maps and Semantic Maps. In early 2010 I was contracted by the Wikimedia Foundation to create the Storyboard extension, and by the Karlsruhe Institute of Technology to do work on Semantic MediaWiki. In May 2010 I started working on my Google Summer of Code 2010 project to create an extension management platform for MediaWiki. In between all these things I made various contributions to other extensions, including Semantic Forms, Semantic Internal Objects, Page Object Model, Semantic Compound Queries, Semantic Result Formats and Approved Revisions.

    Next to all the code I created and released, I also attended several events and gave a number of presentations. These events include SMWCamp 2009 in Karlsruhe, the Berlin developers workshop in April and Wikimania 2010 in Gdansk.

    I’m currently all-time MediaWiki committer #18, with 1080 commits. Looking forward to all the awesome stuff I can do in the coming year :)

    by Jeroen De Dauw at 20 July, 2010 01:57 AM

    19 July, 2010

    Samuel Klein

    Hamming on doing great work, in any field

    Read this transcript of a great public speech about how to do great science, by R. W. Hamming. This sort of good advice is timeless… as are many of his works.

    by metasj at 19 July, 2010 11:43 PM

    Wikipedia Signpost

    18 July, 2010

    Jeroen De Dauw

    MediaWiki testing with PHPUnit

    I figured having some unit tests for Maps, the MediaWiki extension to work with geographical data and display it by embedding dynamic maps into your articles, would be beneficial to its quality. It’s pretty hard to try to cover all possible use cases with manual tests, and it consumes a lot of time in any case. I therefore decided to try to create some tests for the coordinate parser and formatter class, as it’s arguably the core feature of Maps.

    I started off by trying to install plain PHPUnit, which is the most commonly used unit testing framework for PHP. This took me a while, as you are supposed to install it using PEAR (PHP Extension and Application Repository), a repository tool for PHP applications, and I had never used it before. After two hours or so of messing around, I got both installed :) Then I went on to investigate how I could best integrate this into my workflow, and discovered that PHPUnit comes bundled with Zend Studio, seamlessly integrated, working completely out of the box o_O.

    I then wrote a test case for the coordinate parsing and formatting class of Maps. I had a hard time getting it to work, as I needed to include MW itself, since the class uses MW functions. After some non-constructive discussion with several fellow MW devs I found a way to get it to work by including the maintenance script entry point, and tricking MW into thinking the call was made from a CLI. I now have a test case for the coordinate class, with tests for most of its functionality. Some more test data, and maybe some extra tests, would be nice. A tricky thing in the case of this class is rounding errors, which are hard to take into account, especially if you only want to allow them to a certain degree.
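
    The usual way to allow rounding errors "to a certain degree" is to compare floating-point results with a tolerance rather than for exact equality. Here is a minimal sketch of that idea using Python's unittest in place of PHPUnit. The parse_dms function is a hypothetical stand-in, not the Maps coordinate parser, and the tolerances are arbitrary choices.

```python
import re
import unittest

# Hypothetical stand-in for a coordinate parser; not the Maps extension's code.
_DMS = re.compile(r"(\d+)°\s*(\d+)′\s*([\d.]+)″\s*([NSEW])")


def parse_dms(text: str) -> float:
    """Parse a string like '43° 46′ 21.35″ N' into signed decimal degrees."""
    match = _DMS.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised coordinate: {text!r}")
    degrees, minutes, seconds, ref = match.groups()
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    return -value if ref in "SW" else value


class ParseDmsTest(unittest.TestCase):
    def test_north_latitude(self):
        # Tolerance of five decimal places instead of exact float equality.
        self.assertAlmostEqual(parse_dms("43° 46′ 21.35″ N"), 43.772597, places=5)

    def test_west_is_negative(self):
        self.assertAlmostEqual(parse_dms("4° 30′ 0″ W"), -4.5, places=9)

    def test_garbage_is_rejected(self):
        with self.assertRaises(ValueError):
            parse_dms("not a coordinate")
```

    Saved to a file, the tests run with `python -m unittest <file>`. The `places` argument is where the allowed degree of rounding error is set.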

    This particular test case is already paying off, as it made me find 3 subtle errors in coordinate parsing or formatting that did not show up in my manual tests, because my manual test data did not include the cases that trigger them.

    I’m now planning to maybe write test cases for the distance parser too, which should be rather easy to do. I probably won’t create any others for Maps, as it’s rather time consuming, and I have a lot of other things to do right now. When I create new classes that are suited for unit tests in the future, I’ll definitely write tests for them as I build them up though, as it won’t cost much more time than doing manual tests, and will ensure the classes are really solid.

    PHPUnit integration with Zend Studio

    by Jeroen De Dauw at 18 July, 2010 10:26 PM

    Ziko van Dijk

    Ziko

    On Sunday (July 18th), a major German inner city highway became a piece of community art. The Ruhrschnellweg A40 from Duisburg to Dortmund was inhabited by countless organizations and citizens who presented themselves. And the “Wikipedia Stammtisch Ruhrgebiet” was part of the action. Thanks to Wikimedia Deutschland and Pedia-Press, but above all to our main organiser Benutzer:Wuselig! (pictures)


    by Ziko van Dijk at 18 July, 2010 08:54 PM