January 13, 2005
Distributed Classification
Ulises Ali Mejias has just published a paper on distributed classification, also known as free tagging, open tagging, ethnoclassification, folksonomy, and faceted hierarchy, and associated with such services as flickr, del.icio.us and furl. Mejias' main focus is on how users perceive these systems, and how they interact with each other through them. A useful literature review of a nascent field of research.
January 10, 2005
The Importance of Being Permanent
Without permanence you slip off the search engines. Without permanence, bold ideas like 'news as conversation' fall away, because you're shutting down the conversation before it has barely started. Without permanence, you might be on the web, but you're certainly not part of it.
From The Importance of Being Permanent, a PressThink article by Simon Waldman, the Director of Digital Publishing for The Guardian Newspapers.
November 08, 2004
Is It Time for a Moratorium on Metadata?
- Issue a joint proclamation that the DCMI, MPEG-7 and Semantic Web initiatives are all Official Successes and are Ready for Business.
- Issue a second proclamation calling for a general moratorium on metadata.
- Concentrate on locating objects within a range of mixed-media assets based on context-sensitive queries.
- Ask public-spirited citizens worldwide to contribute their favorite photos, audio fragments, or personal videos to create a culturally diverse corpus of 1 million nontext media assets.
- Embark on a multimedia content differentiation competition that will allow a comprehensive but limited set of objects to be identified: people, places, objects, and life events (births, weddings, deaths, and so on). The catch: Any contributed techniques must apply to multiple encoding formats (pictures, video, audio), and each must include a user interface for managing media classification.
October 25, 2004
Are Data and Metadata Interchangeable?
David Weinberger writes:
"So, in the Third Age of Order, all data is metadata. Contents are labels. Data is all surface and no insides. It's all handles and no suitcase. It's a folder whose content is just another label. It's all sticker and no bumper."
Or maybe, since all data is metadata, metadata has become meaningless. The ability, and the desire, to search every aspect of the data in context hasn't made the data any less substantive (handle, label, sticker) but calls into question the need for metadata itself.
October 05, 2004
Intellectual Property Rights (IPR) in Networked E-Learning
Copied from elearnspace:
Intellectual Property Rights (IPR) in Networked E-Learning: "This guide aims to provide a user-friendly introduction to IPR issues for e-learning content developers and managers. It is intended to act as a point of entry to the field of IPR in e-learning that will provide a good foundation for building expertise in the e-learning developer community. It deals with the basic aspects of IPR, especially copyright, in e-learning content development, with an emphasis on reusing third party materials to create new resources."
September 29, 2004
Emerging Face of Information Search
New from K-Praxis: Emerging Face of Information Search. Its initial focus is on two key issues: (1) How well does a search engine understand the users’ intention? and (2) What are the challenges and questions that arise while interpreting the users’ intent?
September 24, 2004
Refining the Search Engine
An interview with Ramesh Jain from Georgia Tech. "Current search engines like Google do not give me a 'steering wheel' for searching the Internet... The search engines get faster and faster, but they're not giving me any control mechanism. The only control mechanism, which is also a stateless control mechanism, asks the searcher to put in keywords, and if I put in keywords I get this huge monstrous list. I have no idea how to refine this list. The only way is to come up with a completely new keyword list. I also don't know what to do with the 8 million results that Google threw at me. So when I am trying to come up with those keywords, I don't know really where I am. That means I cannot control that list very easily because I don't have a holistic picture of that list. That's very important. When I get these results, how do I get some kind of holistic representation of what these results are, how they are distributed among different dimensions."
August 24, 2004
Congressional Budget Office Report: Copyright Issues in Digital Media
The Congressional Budget Office has just released a report, Copyright Issues in Digital Media. According to the report's summary, Congress has three options for addressing the current copyright situation:
- Congress could do nothing and allow market forces to work ("forbearance"). This option would depend on the effective development and implementation of DRM technologies.
- Congress could use compulsory licensing to set a price for certain types of creative works, e.g., imposing a tax on computers and using that revenue to pay royalties to copyright holders.
- Congress could revise copyright law in favor of one of the groups whose interests are at stake in the copyright debate: the copyright owners or the users of copyrighted material. "Allowing copyright owners to have too much control could exacerbate the compromised efficiency that some differential pricing schemes can create in the presence of weak competitive pressures... Revising copyright law in favor of consumers, in contrast, could lead to inefficiency by making differential pricing less feasible."
August 02, 2004
Four Reasons to be Happy about Internet Plagiarism
Just got around to this year-old article, Four Reasons to be Happy about Internet Plagiarism. They are:
- The institutional rhetorical writing environment (the "research paper," the "literary essay," the "term paper") is challenged by this, and that's a good thing.
- The institutional structures around grades and certification are challenged by this, and that's a good thing.
- The model of knowledge held by almost all students, and by many faculty -- the tacit assumption that knowledge is stored information and that skills are isolated, asocial faculties -- is challenged by this, and that's a good thing.
- But there's a reason to welcome this challenge that's far more important than any of these -- more important, even, than the way the revolutionary volatility of text mediated by photocopying and electronic files has assaulted traditional assumptions of intellectual property and copyright by distributing the power to copy beyond those who have the right to copy. It's this: by facing this challenge we will be forced to help our students learn what I believe to be the most important thing they can learn at university: just how the intellectual enterprise of scholarship and research really works.
A video of Lawrence Lessig's presentation, The Future of Copyright, Culture and Creativity, delivered on 21 May 2004 in Helsinki.
July 04, 2004
Integrating Metadata with the Desktop
There is an interesting article over at The Useful Information Company describing ways in which the integration of metadata with the desktop may be an inevitable and useful direction. The author argues that as the amount of information stored on desktops rapidly grows, new and innovative ways must be used to locate and interact with that data (down with inefficient hierarchical trees!). As a solution, he suggests using RDF as a potential model. Along with describing the problem, the article also provides links to some nifty RDF toolkits. Here is the Slashdot discussion on the article.
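The RDF idea can be shown in miniature. Here is a toy sketch (my own, not from the article; the file names and properties are invented): metadata lives as subject-predicate-object triples, so a file can be found by any of its properties rather than by its position in a folder hierarchy.

```python
# Toy triple store illustrating the RDF model for desktop metadata.
# (Hypothetical example; real RDF tooling such as an RDF library with
# SPARQL support would replace these few lines.)
triples = [
    ("file:///home/user/report.pdf", "dc:title", "Quarterly Report"),
    ("file:///home/user/report.pdf", "dc:subject", "budget"),
    ("file:///home/user/notes.txt", "dc:subject", "budget"),
]

def match(triples, predicate, obj):
    """Return every subject carrying the given predicate/object pair."""
    return sorted(s for s, p, o in triples if p == predicate and o == obj)

# One property query replaces a walk through the directory tree.
budget_files = match(triples, "dc:subject", "budget")
print(budget_files)
# → ['file:///home/user/notes.txt', 'file:///home/user/report.pdf']
```

The point of the model is that the `dc:subject` tag, not the folder path, is what locates both files.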
July 02, 2004
Inter-American Workshop on Access to Environmental Data
A summary of an Inter-American workshop on access to environmental data, just released, provides a good survey of regional and global initiatives and scientific, technical, policy and institutional issues. "Scientists in many Latin American countries already have significant capabilities and data resources that would be of benefit to North American researchers through increased collaboration. Latin American researchers similarly would be afforded new or enhanced capacity-building opportunities and greater exposure to North American data management principles and know-how of direct relevance to their activities."
June 29, 2004
New Institutional Repository Software: Digital Commons @
Federal Depository Library Program "Broken"
"The Federal Depository Library Program has fallen behind in cataloging and preserving access to government documents published only on the Web. As a result, public access to those publications is spotty at best." [More]
Interview with Tim Brody on Digitometrics
Sara Kjellberg interviews Tim Brody on Digitometrics, which combines the results from citation analysis with web logs to rate individual articles. Brody is the creator of Citebase Search, which enhances OAI-harvested metadata with linked references harvested from the full-text [in this case, from arXiv, the physics subject archive] to provide a web service for citation navigation and research impact analysis. Citebase is an experimental service which grew out of the Open Citation Project. [More]
June 07, 2004
Elsevier Allows Open Access Self-Archiving
From Information Today:
"In a move that has stunned both the publishing community and the academic world, major journal publisher Elsevier is going to permit Open Access self-archiving for almost all of its journal titles. Under the new policy it will permit authors to self-archive their materials. This move will not change Elsevier’s subscription model for funding.
"'An author may post his version of the final paper on his personal Web site and on his institution's Web site (including its institutional repository). Each posting should include the article's citation and a link to the journal's home page (or the article's DOI),' stated Karen Hunter, Elsevier vice president for strategy. 'The author does not need our permission to do this, but any other posting (e.g., to a repository elsewhere) would require our permission. By his version we are referring to his Word or Tex file, not a PDF or HTML downloaded from ScienceDirect—but the author can update his version to reflect changes made during the refereeing and editing process.'"
May 27, 2004
Rights Expression Languages (RELs)
Karen Coyle has just published a report on Rights Expression Languages commissioned by the Library of Congress. The report provides an analysis of a representative sample of RELs, including CreativeCommons, METSRights, Open Digital Rights Language (ODRL), and MPEG-21, Part 5 (MPEG-21/5).
May 25, 2004
Contract, Copyright and the Future of Digital Preservation
From the LibraryLaw Blog: Alicia Ryan has published a new law review note in the Boston University School of Law's Journal of Science and Technology. Ryan proposes that three rights be reserved for libraries and archives: the right to copy and preserve the Web; the right to copy and preserve any digital work found to be endangered; and the right to lend these digital works once they have become commercially unavailable for five years.
Audioscrobbler
As information retrieval tools and methodologies become more and more refined, delivering more and more accurate results, I grow more and more worried. Worried that we are eliminating all chance and coincidence from our online lives. Worried that we will end up learning only what we already know. Worried that my carefully constructed virtual world will end up a stagnant pond. Back in the real world, if your circle of friends is wide enough, or merely sufficiently awake to their surroundings, you'll come away knowing something new with every interaction. Mechanisms such as collaborative filtering may go a long way to re-inserting into the mix the serendipitous or, as Donald Rumsfeld quite eloquently put it, "what we don't know we don't know."
That's probably a needlessly pompous introduction to Audioscrobbler. Audioscrobbler uses a plug-in to track what you're listening to, creates a playlist for you, and compares your listening pleasures with other Audioscrobblers.
I heard about Audioscrobbler from Dan Hill's cityofsound blog. Hill points out that Audioscrobbler could be made even more useful if listeners could be weighted and recommendations evaluated based on whether I trusted this person's judgment, or whether the listener was just some moron who stumbled across something I happened to like too. He also makes the great suggestion that we need to be able to use Audioscrobbler with our iPods, where most of the listening takes place. To this I would also add one other suggested improvement. I often end up listening to WFMU's MP3 stream, rather than my own collection of songs. Almost all of the programs on WFMU now have automated playlists... Wouldn't it be great if Audioscrobbler could track automated playlists from online radio stations? Wouldn't it be great to have Arbitron-like numbers for people who listened to, and enjoyed, music?
One more point Hill makes that bears repeating. He talks about suddenly becoming conscious of what he was listening to, and playing his favorite songs for Audioscrobbler. Once we all wanted our friends to understand what made us tick; now we've shifted that same anxiety to our software.
May 24, 2004
New IR Software: Archimede
The following comes directly from the Archimede website:
Laval University Library recently launched the third component of its institutional repository. Called «Archimede» (http://archimede.bibl.ulaval.ca), this component covers e-prints, pre-prints, post-prints and other research publications from faculty members and research communities.
Following a thorough analysis of available software solutions, including E-prints and D-Space, the library decided to develop its own customized application. Inspired by the D-Space model, Archimede is arranged around research communities and fully developed in open source. The system is OAI compliant, using a Dublin Core metadata set. An open source distribution of Archimede will be available soon.
Following are some highlights of the special features and characteristics of the system:
- Archimede has been developed in a multilingual perspective, with internationalization as a focus. Using the open source standard (i18n), the text (or content) of the interface is independent and not embedded in the code. It is then relatively easy to develop an interface in a specific language without having to work on the code itself. English, French and Spanish interfaces are already offered in Archimede. That feature also allows the user to switch easily from language to language anywhere and anytime during his search and retrieval process.
- Archimede is flexible and not dependent on a specific platform. The system can be installed on Linux as well as on Windows. For a library wishing to implement the solution, the system can be easily adapted to the technical infrastructure already in place, thus increasing the efficiency of the implementation process.
- Archimede allows searching on metadata as well as on the full text, thus enhancing the power of the search engine. An application is being developed that will automatically generate and translate from the text and the abstract a proposed set of controlled vocabulary subject headings. This will be done through the « Répertoire des vedettes-matières de la Bibliothèque de l'Université Laval » and its links to LC Subject headings, Canadian Subject headings, MeSH and AAT.
- The search engine is based on open source Lucene, using LIUS (Lucene Index Update and Search), a customized framework developed at Laval by the library staff. LIUS allows indexing of different types of document formats: XML, HTML, PDF, RTF, MS Word, MS Excel, JavaBeans; it also permits mixed indexing, integrating for example in the same occurrence metadata in XML and full text in PDF, HTML, etc.
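The mixed indexing the last bullet describes can be sketched with a toy field-aware inverted index (my own illustration in Python, not actual LIUS code; the document ids and field names are invented). The idea is that one index entry may carry both structured metadata fields and extracted full text, and a query can target either.

```python
# Toy field-aware inverted index: metadata and full text side by side.
from collections import defaultdict

index = defaultdict(set)   # (field, term) -> set of document ids

def add_document(doc_id, fields):
    """Index every whitespace-separated term of every field."""
    for field, text in fields.items():
        for term in text.lower().split():
            index[(field, term)].add(doc_id)

def search(field, term):
    """Return the sorted ids of documents matching term in field."""
    return sorted(index[(field, term.lower())])

# Mixed indexing: XML-style metadata plus text extracted from, say, a PDF.
add_document("eprint-1", {"dc:creator": "tremblay",
                          "fulltext": "open archives metadata harvesting"})
add_document("eprint-2", {"dc:creator": "smith",
                          "fulltext": "metadata quality in repositories"})

print(search("fulltext", "metadata"))   # full-text hit in both documents
print(search("dc:creator", "smith"))    # metadata-only hit
```

A real Lucene index does far more (analysis, scoring, incremental updates), but the field/term/posting structure is the same.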
May 20, 2004
Institutional Repositories in the Context of Digital Preservation
From Open Access News:
Paul Wheatley, Institutional Repositories in the Context of Digital Preservation, Digital Preservation Coalition, Technology Watch Series Report 04-02, 2004. Excerpt: "The key recommendations from this report are for the continued development of specific requirements for trusted digital repositories, and also for the creation of independent certification services for digital repositories that will evaluate how repositories meet these requirements. A clearer picture can then be presented as to how well institutional repository software, as well as specific digital repositories, can deliver effective digital preservation."
Anarchist in the Library
Lifted this morning from boingboing:
I've just finished reading Siva Vaidhyanathan's excellent new book The Anarchist in the Library, a discourse on the real culture war: the fight between open systems for exchanging knowledge and closed systems that see knowledge as a marketable commodity. The best part of this book is that it repudiates technology as a tool for making policy, calling for deliberation instead: in other words, copyright strictures should be created by courts and lawmakers, not DRM.Both visions of the perfect library -- utopian [all knowledge available for free, organized by volunteers] and dystopian [child-porn, spoilers and amateurish information supplanting high-quality research] -- are overstated. We are not close to constructing the perfect library, but we can imagine how it might look and act. Many of our communal efforts since the early 1990s seem to be moving our information ecosystem toward that vision. Yet long before we ge there, many are sounding alarms about the ways people might abuse their freedoms to use and move information. Even though the perfect library is not imminent, many are acting as if it is. The strong reactions of those who would squelch these freedoms might render our information systems unable to perform the positive functions of the perfect library because of the unexamined -- often merely assumed -- threats to the status quo. The closer we get to the perfect library the more the oligarchs undermine it.
May 13, 2004
Natural Language Processing/Information Retrieval Software Repository
Also thanks to Marcus Zillman, I have discovered the Natural Language Processing/Information Retrieval Software Repository at the School of Computing, National University of Singapore. The software is intended for use by the students and researchers there, but is available to anyone.
The Academic Web Link Database Project
Taken verbatim from Marcus Zillman's blog:
The Academic Web Link Database Project
The Academic Web Link Database Project makes available databases of academic web links to the world research community. This project was created in response to the need for research into web links: including web link mining, and the creation of link metrics. It is aimed at providing the raw data and software for researchers to analyse link structures without having to rely upon commercial search engines, and without having to run their own web crawler. You may use all of the resources on this site for non-commercial reasons provided that you notify them if you have an academic paper or book published that uses the data in any way (so that they know the site is getting good use).
Information Cannot Be Owned
A paper just released by Jean Nicolas Druey through the Berkman Center for Internet & Society at Harvard Law School. "The sum of these considerations is that information is not an object for ownership. It should not be and it cannot be. It should not, because communication, being the exchange of information between persons, is an act occurring among these persons and is therefore determined by them, and constitutes one of the highest social values. And it cannot, unless being arbitrary, because it is not possible to form information units by cutting them out from their context with other information (horizontal aspect) or their ties to previous and subsequent information (vertical aspect)."
May 06, 2004
Practical Strategies for Filling Institutional Repositories
The April issue of Ariadne has an article by Morag Mackie, Filling Institutional Repositories: Practical strategies from the DAEDALUS Project. Pretty depressing reading. All of the strategies they attempted (schmoozing prominent academics, schmoozing faculty interested in open access issues, trawling departmental websites, identifying open access journals and searching for any Glasgow faculty who had published there) required enormous amounts of time, energy and resources, and yielded precious little in the way of results. "Change is only likely to happen if staff are required, either by the funding councils or by their institution, to make their publications available either by publishing in open access journals or in journals that permit deposit in a repository."
Open Access Movement
I've just started reading Peter Suber's blog on open access.
"The open access movement: Putting peer-reviewed scientific and scholarly literature on the internet. Making it available free of charge and free of most copyright and licensing restrictions. Removing the barriers to serious research."
May 05, 2004
The OpenNet Initiative
Originally posted on Marcus Zillman's blog:
"The OpenNet Initiative is a University-based policy research project documenting filtering and surveillance practices worldwide. Our aim is to excavate, expose and analyze these practices in a credible and non-partisan fashion, to uncover the potential pitfalls of present policies, to explore the possibility of unintended and unexpected consequences, and thus to help inform better public policy and advocacy work in this area. To achieve these aims, the ONI employs a unique multi-disciplinary approach that includes: Advanced Technical Means using a suite of sophisticated network interrogation tools and metrics; and Local Knowledge Expertise through a global network of regionally based researchers and experts. OpenNet Initiative research will be published on this website in a series of national and regional case studies, occasional papers, and bulletins.
"As part of its work, the OpenNet Initiative also operates a 'clearinghouse' for circumvention technologies that assess and evaluate systems intended to let users bypass filtering and surveillance. We also actively develop circumvention technologies in-house as a means to explore the limitations of filtration and counter-filtration practices.
"The OpenNet Initiative is a collaborative partnership between three leading academic institutions: the Citizen Lab at the Munk Centre for International Studies, University of Toronto, Berkman Center for Internet & Society at Harvard Law School, and the Advanced Network Research Group at the Programme for Security in International Society (Centre for International Studies) at the University of Cambridge. "
May 04, 2004
Institutional Repositories: An Overview
Miriam Drake, the former Dean of Libraries and currently Professor Emerita at Georgia Tech, has just published a succinct overview on institutional repositories. Covered are policies, legal considerations, standards, sustainability and funding, as well as examples.
Sustaining Digital Scholarly Resources
First Monday has just published selected papers (with video) from the Fifth Annual Conference on Libraries and Museums in the Digital World, sponsored by the U.S. Institute of Museum and Library Services and the University of Illinois at Chicago (3–5 March 2004). The first paper that jumped out at me was Don Waters' Building on Success, Forging New Ground: The Question of Sustainability.
This paper focuses on three factors that contribute to the sustainability of digital scholarly resources. First, the development of such resources depends on a clear definition of the audience and the needs of users. Second, the resource must be designed to take advantage of economies of scale. Third, to create an enduring resource, careful attention is needed to the design of the organization that will manage the resource over time.
As usual, I went straight to the section on structural impediments... er... Waters calls it "Organizational Design." Here's the quote I'll be borrowing:
"... the huge economies of scale that are possible with digital databases are difficult to manage over current institutional boundaries. Much as they might like in principle to do so, few academic institutions, large or small, are actually endowed with the mission, leadership, accountability, support structures, and other organizational apparatus to serve up collections to scholars worldwide."
Latent Semantic Indexing
Clara Yu, John Cuadrado, Maciej Ceglowski and J. Scott Payne. Patterns in Unstructured Data: Discovery, Aggregation, and Visualization. A Presentation to the Andrew W. Mellon Foundation. 2002.
"In talking about search engines and how to improve them, it helps to remember what distinguishes a useful search from a fruitless one. To be truly useful, there are generally three things we want from a search engine:
- We want it to give us all of the relevant information available on our topic.
- We want it to give us only information that is relevant to our search.
- We want the information ordered in some meaningful way, so that we see the most relevant results first.
"Improving our trinity of precision, ranking and recall, however, requires more than brute force. In the following pages, we will describe one promising approach, called latent semantic indexing, that lets us make improvements in all three categories. LSI was first developed at Bellcore in the late 1980's, and is the object of active research, but is surprisingly little-known outside the information retrieval community.
"Latent semantic indexing adds an important step to the document indexing process. In addition to recording which keywords a document contains, the method examines the document collection as a whole, to see which other documents contain some of those same words. LSI considers documents that have many words in common to be semantically close, and ones with few words in common to be semantically distant... Although the LSI algorithm doesn't understand anything about what the words mean, the patterns it notices can make it seem astonishingly intelligent.
"When you search an LSI-indexed database, the search engine looks at similarity values it has calculated for every content word, and returns the documents that it thinks best fit the query. Because two documents may be semantically very close even if they do not share a particular keyword, LSI does not require an exact match to return useful results. Where a plain keyword search will fail if there is no exact match, LSI will often return relevant documents that don't contain the keyword at all."
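The mechanics the authors describe can be sketched with a truncated SVD. In this toy example (terms, documents and counts are all invented, not from the presentation), a query for "boat" retrieves a document containing only "ship" and "ocean", because the two words co-occur with "ocean" across the collection:

```python
import numpy as np

# Term-document count matrix. Rows = terms, columns = documents:
# d0: "ship ocean"   d1: "boat ocean"   d2: "wood tree"   d3: "tree"
A = np.array([
    [1, 0, 0, 0],   # ship
    [0, 1, 0, 0],   # boat
    [1, 1, 0, 0],   # ocean
    [0, 0, 1, 0],   # wood
    [0, 0, 1, 1],   # tree
], dtype=float)

# Truncated SVD: keep only k latent "concepts".
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
docs_k = (np.diag(s[:k]) @ Vt[:k]).T          # documents in concept space

def fold_in(query_vec):
    """Project a keyword query vector into the same concept space."""
    return np.linalg.inv(np.diag(s[:k])) @ U[:, :k].T @ query_vec

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

q = np.array([0, 1, 0, 0, 0], dtype=float)    # query: "boat"
q_k = fold_in(q)
sims = [cosine(q_k, d) for d in docs_k]
# d0 ("ship ocean") contains no query keyword, yet its similarity is
# near 1.0: "ship" and "boat" both co-occur with "ocean". A plain
# keyword match on "boat" would have missed it entirely.
print([round(x, 2) for x in sims])
```

This is the whole trick in miniature: documents land near each other in concept space because of shared co-occurrence patterns, not shared keywords.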
Delivering Classics Resources with TEI-XML, Open Source, and Creative Commons Licenses
The name says it all. This page describes the new initiative:
The Center for Hellenic Studies of Harvard University has adopted an innovative technological program for free online publication of books, articles, and databases designed to make resources in the classics more visible and accessible.
I must say, this is tremendously exciting news for the Humanities, Classics, e-learning, and anyone interested in innovative initiatives to share data.
May 03, 2004
More on Cooperation between IT Core Services and Libraries
Also in the May issue of Syllabus Magazine is Paul Conway's case study of Duke University's Digital Library initiative. The quote from Deep Infrastructure Supports Digital Library Services that caught my eye: "The digital library program at Duke is not an isolated 'free agent' on campus but is closely allied with central IT operations and with technology activities based in professional schools and academic departments. The relationship between the library IT operation and the central Office of Information Technology (OIT) can best be characterized as highly collaborative and collegial in that special southern way. [??!!] The resources that OIT marshals dwarf those of the library. The library’s digital initiatives have, as a result, emphasized a principled division of labor that builds on the library’s traditional intellectual strengths: structured organization of information resources, a deep commitment to preservation, and mediation in the search and retrieval process."
Learning Object Repositories and Digital Repositories
Learning Object Repositories, Digital Repositories, and the Reusable Life of Course Content by Philip Long has just been published in the May issue of Syllabus magazine. "What do learners need? They should be able to draw on digital assets from any resource, or repository, that strikes them as useful—even if the rationale is serendipity—at the exact moment when the learning activity calls for it. Today they can’t do that."
In Search of Searches Past
In an interview on National Public Radio this morning, the novelist and poet Sandra Cisneros recalled how, as a young girl, she would search out the most darkened and dog-eared cards in the card catalog at her local branch of the Chicago Public Library, and read as many of these books as possible, since they were obviously popular (and therefore, good). This tactile memory triggered in me one of those madeleine moments to which the senescent so often fall prey. I found myself back thirty years ago in the card catalog room of the New York Public Library, where moving from the tray containing Aar-Abe to the tray containing Luv-Mab gave you a palpable sense of the vast physical holdings crammed below and around you. I remembered how often I would come across, among the machine-made cards, a lined index card from the previous century, with a handwritten script today only available for special occasions and at great expense. (That card catalog was chopped up into end tables less than ten years later and given to those who had made a substantial contribution to the NYPL. I remember the first time I saw one of the tables in someone’s home; I think my expression was probably the same as someone seeing their first shrunken head.)
I also recalled tracking down a copy of Velleius Paterculus in the stacks of Butler Library at Columbia University. Their one copy had been printed in Amsterdam in the late 17th century, and still bore the King’s College stamp. In the back was the sign-out card, which traced decades of use by scholars, some of whom were unknown, a few not only known, but already absorbed into what we would have referred to back then as the warp and woof of history. (Perhaps this entry is also the story of the search for unpopular books.)
Of course, even thirty years ago, technological change was barreling down on us at an alarming pace. At Saint Louis University in 1973, we could browse the Vatican Library on microfilm, although I don’t recall too many finding aids; perhaps access has improved since then, perhaps not. Those handwritten index cards had been curiosities for some time. While I might have added my name to the list on the sign-out card for Velleius Paterculus, more likely even then my ID was optically scanned and no physical artifact remains; just my memory. And all the end tables in the world would not persuade me to reverse time. I sympathize with Nicholson Baker’s seemingly inconsolable anger, but it’s not mine. My life is undeniably better now than it was thirty years ago when I first fell in love with the doing of scholarly research.
But Cisneros did make me think about what is lost. Thirty years ago, searching engaged our senses and created memories. Searching was a physical activity, taking place in a specific time and place. We tripped over, time and time again, serendipitously, human remains. I still recall some of what I learned back then, but I also remember the learning: long, lazy summer days in the Main Reading Room at the NYPL, the soft whirr of the fans, music floating up and through the windows from Bryant Park, waiting for books to be sent up. Even if Google gives me the information I need, and when I need it, what will I remember? Will I be happy?
April 27, 2004
Metadata Quality in e-Learning: Garbage In - Garbage Out?
We've been talking quite a bit here (yelling, screaming, kicking, and biting) about who is best equipped to manage the creation of metadata, as if any sane person really wants that job! Sarah Currier has recently published an article on the cetis website, Metadata Quality in e-Learning: Garbage In - Garbage Out?, which discusses the collaborative model in which metadata is jointly created by the educational practitioner and the information scientist, and the strengths and weaknesses of such a model. She includes this uplifting quote from Simon Pockley's Metadata and the arts: the art of metadata:
"Just as the production of feature films has been characterized by the concept of assembly or montage, so we could consider metadata production to be the result of the combined efforts of quite separate skills. Perhaps it is time for the Metaphiles to talk more about the art of metadata, about how images and sounds can also be metadata and about the new literacy of this emerging form of expression."
April 22, 2004
Google Teams Up with DSpace Revisited
Some people have said that this may not be a good idea. The author, however, may be missing the point. Although the concern about the currently limited scope of the searchable material is perhaps valid, Google's service is more of a beta test than finished software.
I personally like the ability to restrict my searches and filter the results from the outset. Google already allows people to restrict results to specific domains, so why not allow restricting results by more abstract filters? I agree with the author that this policy merely "sounds good," but I think Google should be given more leeway as the technology develops and DSpace (and other institutional repositories) become more common.
April 16, 2004
Amazon's A9 Search Portal Launched
Open Archives Initiative Data Providers
Open Archives Initiative Data Providers - Part I, Gerry McKiernan's eProfile column from the Apr 04 issue of Library Hi Tech News: "In this first of a series, we profile more recently established Open Archives Initiative (OAI) Data Providers whose content is not only 'harvestable' by OAI Service Providers ... but perhaps more importantly, offer open access to institutional and discipline information resources in a wide variety of publication and media formats."
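For readers unfamiliar with what "harvestable" means in practice: OAI-PMH is just HTTP requests with a small set of verbs, returning XML records (typically unqualified Dublin Core). The sketch below shows what a ListRecords request and response look like; the repository URL and the sample response are hypothetical, trimmed for illustration.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

DC_NS = "{http://purl.org/dc/elements/1.1/}"

def build_listrecords_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a data provider."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)

def extract_titles(response_xml):
    """Pull Dublin Core titles out of a ListRecords response."""
    root = ET.fromstring(response_xml)
    return [t.text for t in root.iter(DC_NS + "title")]

# A trimmed sample response, for illustration only.
sample = """<?xml version="1.0"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Example Preprint</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

# Hypothetical repository base URL.
print(build_listrecords_url("https://repository.example.edu/oai"))
print(extract_titles(sample))
```

A service provider does little more than this in a loop: issue the request, parse the records, and follow the resumptionToken until the set is exhausted.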
Social Harvesting of Community Knowledge
Another interesting HP Information Dynamics Lab project is Social Harvesting of Community Knowledge (SHOCK). "Shock is designed as low-cost, extensible, flexible, and dynamic peer-to-peer knowledge network that helps address this problem. The system is designed to protect the privacy of user's personal information, such as email, web browsing habits, etc., while making that information available for knowledge management applications. It reduces participation costs for such applications as expert-finding, allows highly targeted messaging, and enables novel kinds of ad hoc conversation and anonymous messaging. The system is tightly integrated with users' email clients, taking advantage of email as habitat."
April 15, 2004
Google Teams Up with DSpace
A pilot project is underway, between Google and 17 universities currently running DSpace, to make those institutions' collections of scholarly papers searchable through Google's advanced-search page. (Chronicle of Higher Education, 9 Apr 04)
DSpace User Group Meeting
The presentations, summary and outcomes for the DSpace User Group Meeting (10-11 Mar 04) are now available.
April 08, 2004
Information Dynamics in Blogspace
Visual Text Mining
Blog-Fu highlighted this visual text mining tool on 7 Apr 04. "txtkit is an Open Source visual text mining tool for exploring large amounts of multilingual texts. It's a multiuser-application which mainly focuses on the process of reading and reasoning as a series of decisions and events. To expand this single perspective activity txtkit collects all of the users mining data and uses them to create content recommendations through collaborative filtering ... txtkit has an open and variable architecture which allows you to add multiple sources in different languages on different servers."
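The collaborative filtering mentioned in that description is a standard technique: pool every user's reading data, find readers whose histories resemble yours, and surface texts they valued that you haven't seen. This is not txtkit's actual code, just a minimal sketch of the idea, with made-up reading histories scored by (say) time spent reading.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two sparse rating dicts."""
    common = set(u) & set(v)
    num = sum(u[k] * v[k] for k in common)
    den = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return num / den if den else 0.0

def recommend(target, others, top_n=3):
    """Score texts the target hasn't read, weighted by each neighbor's similarity."""
    scores = {}
    for other in others:
        sim = cosine(target, other)
        for text, rating in other.items():
            if text not in target:
                scores[text] = scores.get(text, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Toy reading histories: text id -> implicit rating (e.g. time spent reading).
alice = {"tao-te-ching": 5, "ulysses": 3}
bob   = {"tao-te-ching": 4, "ulysses": 2, "beowulf": 5}
carol = {"ulysses": 5, "middlemarch": 4}

print(recommend(alice, [bob, carol]))  # beowulf ranks first: bob reads most like alice
```

The interesting move in txtkit is the input: rather than explicit ratings, it treats the trail of reading decisions itself as the data to be pooled.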
April 06, 2004
Cory Doctorow's Metacrap: Putting the torch to seven straw-men of the meta-utopia was published way back (26 Aug 01), but remains a nice reality check for anyone knee-deep in metadata. "A world of exhaustive, reliable metadata would be a utopia. It's also a pipe-dream, founded on self-delusion, nerd hubris and hysterically inflated market opportunities."
Interoperability between Information, Learning Environments
CNI and the IMS Global Learning Consortium have released a whitepaper, Interoperability between Information and Learning Environments, which examines "potential interactions between information environments and learning environments, with emphasis on work that needs to be done involving standards, architectural modelling or interfaces..." (Reviews and comments go to Cliff Lynch or Neil McLean.)