On 2/24/21 12:31 AM, Oliver Simmons wrote:
[snip]
> # My conclusion on the metadata topic
>
> I think the discussion of the how (format and what it would affect) has
> been good, but we should stop with it until a good why (reasoning and use
> case) has been found.
[snip]

One example: when the capsule's owner has included date metadata, searches there can be expanded or narrowed according to date. The same goes for title, author, subject, or any other field that has been maintained throughout the capsule. With machine-readable metadata in place, the system can do the work of limiting searches.

Another example: browsing through a 'tag cloud', also known as subject categories, is very common. That is one way of browsing through the set of subjects in document metadata, and it is a case that is not niche.

When metadata is not allowed or is missing, what is left is plain full-text search. One well-known confound for full-text searching is a page that talks about a topic in great detail without actually repeating the strings pertaining to that topic. A lot of technical writers within ICT know of this problem and pepper their writing with expected search terms. Writers in other fields, especially researchers, just write to the topic and might not include the subject terms more than once, if even that. Thus full-text searching is quite inaccurate, even with stemming, and does not scale well.

The actual metadata content can be made up as needed, as in an uncontrolled vocabulary, or it can conform to an agreed-upon, restricted set, such as ERIC or LCSH.

After capsules reach substantial size, both in number and length of documents, it becomes impractical to rummage for content manually, and full-text searching becomes increasingly inaccurate. It gets even harder to do relevant searches if the content in the pages is all about the same topic, or about similar, overlapping topics which share vocabulary. There, without the fielded searches which document metadata enables, the material is effectively lost.

So some of the use cases for document metadata deal with trying to retrieve material from largish capsules.

/Lars
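
To make the contrast between full-text and fielded searching concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the `Doc` record, its field names, the sample paths and the sample text. It does not represent any proposed or existing metadata format.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Doc:
    # Hypothetical record: these field names are illustrative only,
    # not a proposed or existing metadata format.
    path: str
    title: str
    author: str
    published: date
    subjects: list[str] = field(default_factory=list)
    body: str = ""


def full_text_search(docs, term):
    """Naive case-insensitive substring match on the body only."""
    needle = term.lower()
    return [d for d in docs if needle in d.body.lower()]


def fielded_search(docs, subject=None, author=None, since=None, until=None):
    """Keep documents whose metadata matches every criterion that was given."""
    hits = []
    for d in docs:
        if subject is not None and subject not in d.subjects:
            continue
        if author is not None and d.author != author:
            continue
        if since is not None and d.published < since:
            continue
        if until is not None and d.published > until:
            continue
        hits.append(d)
    return hits


def tag_cloud(docs):
    """Count how many documents carry each subject term."""
    return Counter(s for d in docs for s in d.subjects)


# Made-up sample capsule: the first page is about composting
# but never uses the word in its body.
docs = [
    Doc("/log/a.gmi", "Notes on soil", "alice", date(2020, 5, 1),
        ["composting"], "Layer browns and greens, keep the heap damp."),
    Doc("/log/b.gmi", "Heap troubleshooting", "bob", date(2021, 1, 15),
        ["composting"], "Composting slows down in winter."),
    Doc("/log/c.gmi", "Bread", "alice", date(2021, 2, 2),
        ["baking"], "A sourdough starter needs regular feeding."),
]

print([d.path for d in full_text_search(docs, "composting")])
print([d.path for d in fielded_search(docs, subject="composting")])
print([d.path for d in fielded_search(docs, subject="composting",
                                      since=date(2021, 1, 1))])
print(tag_cloud(docs).most_common())
```

In this made-up data the full-text search misses the first page, because its body never contains the subject term, while the subject field finds both pages and the date field narrows the result further. The same subject field also feeds the tag cloud counts.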