Human Search Engine
Human Search Engine is a person or persons
searching for knowledge,
information or
answers
on the
internet. A human search
engine uses multiple resources, multiple websites, and multiple search engines with multiple keywords and phrases in order to locate
relevant information. A human search engine
may also use other media sources such as television, movies, documentaries,
radio, magazines, newspapers, advertisers, as well as recommendations
from other people. The main goal of a human
search engine is to find
relevant information and useful links to websites that pertain to a particular subject.
A human search engine is indexed by human
eyes and not by
algorithms.
Discoverability is the degree to which something, especially a piece
of content or information,
can be found in a search of a file,
database,
or other
information system. Discoverability is a concern in library and
information science, many aspects of digital media, software and web
development, and in marketing, since products and
services cannot be used
if people cannot find them or do not understand what they can be used for.
Search Engine Types -
Search Engine Flaws -
Search
Technology
-
Web Types -
Internet Explained
Findability is the ease with which information contained on a website
can be found, both from outside the website and by users already on the
website, or from using
search engines.
I'm more than a Human Search Engine
I am a Human Search Engine, but I'm much more than that. I'm
on a quest to understand the meaning of
human intelligence. I'm also involved in the never ending process of finding ways to
improve education, as well
as understanding how
the
public is informed about the
realities of our world
and
our current situation.
"The collector is the true resident of the interior. The
collector dreams his way not only into a distant or bygone world,
but also
into a better one" -
Walter Benjamin.
Synthesizer is an
intellectual who synthesizes or uses synthetic
methods to
combine
separate elements of information to form a
coherent whole, or to
combine so as to form a
more complex product and produce a new and higher level of truth. A type
of
reasoning from the general to the particular using logical deduction.
Knowledge Engineer is a
professional
engaged in the science of building
advanced logic into computer systems in order to try to
simulate human decision-making
and
high-level cognitive
tasks. A knowledge engineer supplies some or all of the "knowledge"
that is eventually built into the technology.
Knowledge Worker.
Retrieval Augmented Generation is an AI
framework that combines the strengths of traditional
information retrieval systems (such as
databases) with the capabilities of generative
large
language models.
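As a rough sketch of how that combination works: a retriever first pulls the most relevant documents from a store, and the language model then answers using that retrieved context. The keyword-overlap retriever and the generate() placeholder below are illustrative assumptions, not any particular product's API.

```python
# Minimal sketch of Retrieval Augmented Generation (RAG), assuming a
# keyword-overlap retriever and a placeholder generate() standing in for
# any large language model API.

documents = [
    "Paul Otlet founded the Mundaneum to organize the world's information.",
    "A web crawler is a program that browses the web automatically.",
    "The Semantic Web makes internet data machine-readable.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by how many query words they share (the retrieval step)."""
    words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(words & set(d.lower().split())), reverse=True)
    return scored[:k]

def generate(prompt: str) -> str:
    """Placeholder for a real LLM call -- an assumption, not a specific API."""
    return f"[model answer grounded in]: {prompt}"

query = "Who tried to organize the world's information?"
context = "\n".join(retrieve(query, documents))
# The generation step: the model answers the query using the retrieved context.
print(generate(f"Context:\n{context}\n\nQuestion: {query}"))
```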
"We are
drowning in
information, while
starving for wisdom.
The world henceforth will be run by synthesizers, people able to
put together the right information at the
right time, think critically about it, and make important choices
wisely." ~
E. O.
Wilson
World Brain
is a world
encyclopedia that could help
world citizens make the
best use of
universal information resources and make the best contribution
to world peace.
Co-Creation.
Ontology is a
knowledge domain that
is usually
hierarchical and contains all the relevant entities
and their
relations.
Ontology is the
philosophical study of the nature of being, becoming, existence, or
reality, as well as the basic categories of being and their relations.
Ontology in
information science is a formal naming and definition of
the types, properties, and interrelationships of the entities that really
exist in a particular domain of discourse. Thus, it is basically a
taxonomy.
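As a small illustration of an ontology as a data structure, the sketch below stores entities and their relations as triples and walks the hierarchy; the domain and relation names are invented examples, not a formal OWL ontology.

```python
# A toy ontology for a small knowledge domain, stored as (subject,
# relation, object) triples -- a minimal sketch, not a formal ontology.

ontology = [
    ("Dog", "is_a", "Mammal"),
    ("Mammal", "is_a", "Animal"),
    ("Dog", "has_part", "Tail"),
]

def ancestors(entity: str, triples) -> list[str]:
    """Walk the hierarchical is_a relations upward from an entity."""
    found = []
    for s, rel, o in triples:
        if s == entity and rel == "is_a":
            found.append(o)
            found.extend(ancestors(o, triples))
    return found

print(ancestors("Dog", ontology))  # ['Mammal', 'Animal']
```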
Aggregator is a website or program that collects related items of
content and displays them or links to them. Software or a website that
collects and displays information from different
sites. A person who collects things.
WebCrawler is a metasearch engine that blends the top search results from other search engines.
Web Robot is a software application that runs
automated tasks or scripts over the
Internet.
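A minimal sketch of such a robot, using only the Python standard library, might look like this; a real crawler would also honor robots.txt and rate limits.

```python
# A minimal web robot: fetch a page, extract its links, follow a few of them.

from html.parser import HTMLParser
from urllib.request import urlopen
from urllib.parse import urljoin

class LinkParser(HTMLParser):
    """Collects the href values of <a> tags as a page is parsed."""
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url: str, max_pages: int = 3):
    seen, queue = set(), [start_url]
    while queue and len(seen) < max_pages:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        try:
            html = urlopen(url, timeout=10).read().decode("utf-8", "replace")
        except (OSError, ValueError):
            continue  # skip unreachable pages and non-HTTP links
        parser = LinkParser()
        parser.feed(html)
        queue.extend(urljoin(url, link) for link in parser.links)
    return seen

# print(crawl("https://example.com"))
```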
Filtering -
Defragging
the
Fragmented
-
Curation
-
Organizing Wiki Pages -
Information Age
Scribe is a person who serves as a
professional copyist, especially one who made
copies of manuscripts before the invention of
automatic printing. The
profession of the scribe, previously widespread across cultures, lost most
of its prominence and status with the
advent of the printing press. The
work of scribes can involve copying manuscripts and other texts as well as
secretarial and administrative duties such as the taking of dictation and
keeping of business, judicial, and historical records for kings, nobles,
temples, and cities. The profession has developed into public servants,
journalists, accountants, bookkeepers, typists, and lawyers. In societies
with low literacy rates, street-corner letter-writers (and readers) may
still be found providing scribe service.
Internet Mining
is the application of
data mining techniques that are used to discover
patterns
in the World Wide Web.
Web mining can be divided into three different types: web usage mining, web content mining, and web structure mining.
Information
Retrieval is the process of obtaining information system resources
that are
relevant to an
information need from a collection of those resources.
Relevance in information retrieval denotes how well a retrieved
document or set of documents
meets the information need of the user.
Relevance may include concerns such as timeliness, authority or novelty of
the result. Documents which are most
relevant are not
necessarily those which are most
useful.
Knowledge is Gold,
and I'm
gold panning and mining the world for knowledge by
sifting through tons of information. I'm like a
screen, and I'm also like a metal detector. I detect valuable knowledge
using my senses.
Knowledge Mining uses a
combination of intelligent processes to
explore vast amounts of information and quickly learn from it and
uncover hidden insights and find relationships and patterns at scale,
which allows a deep understanding of information.
I'm an internet
miner
exploring the
world wide web. I'm
an
archivist
of information and
knowledge.
I'm extracting and aggregating the most valuable
information and
collecting the most informative websites
that the internet and the world has to offer.
I'm an
information architect
who is
filtering and
organizing the
internet one website at a time. I'm a knowledge moderator, an internet scribe,
and an
intelligent agent, like
AI. But
it's
more than that. I'm an accumulator of knowledge who seeks to pass knowledge on to
others. Lowering the
entropy of the system since 2008.
Welcome to my
journey in
hyperlink heaven. Over
24 years of internet searches that
are organized, categorized and contextualized. A
researcher's dream.
I
have asked the internet hundreds of thousands of different questions,
and I have already clicked my
mouse over 10 million times, and I've only just begun.
I have tracked over 90% of my online
activities since 1998, so my
digital trail is a long one.
This
is my story about one man's journey
through the Internet.
What if you shared everything you learned?
Did you ever wonder?
To put it simply,
I'm organizing the internet.
Over the last 23
years since 1998, I have been surfing the world wide web, or
trail blazing the internet, and
curating
my experience. I've asked
the internet well over 500,000 questions so far. And from those
questions I have gathered a lot of Information, Knowledge and
Resources. So I then organized this Information, Knowledge and
Resources into
categories. I then published it on my website so that
the Information, Knowledge and Resources can be shared and used
for educational purposes. I also share what I've personally
learned from this incredible endless journey that I have taken
through the internet. The internet is like the universe. I'm not overwhelmed by the size of the internet; I'm just amazed by all the things that I have learned, and I wonder just how much more I will be able to understand. Do knowledge and information have a limit? Well, let's find out.
Adventure for me has always been about discovering limits; this is just another
adventure. I'm an internet
surfer who has been riding the perfect wave for over 12 years.
But this is nothing new. In the early 1900s,
Paul Otlet pursued his quest to organize the world’s
information.
A human
search engine
is someone who is not manipulated by money or manipulated by defective and
ineffective algorithms.
A human search engine is
created by humans
and is a service for humans. People want what's
important. People want the most valuable knowledge and
information that is available, without stupid ads, and without
any ignorant manipulation or
censorship. People want a trusted source for information, a
source that cares about people more than money. We don't
have everything, but who needs everything?
I'm a
pilgrim on a
pilgrimage.
I'm an internet pathfinder whose task it is to carry out
daily internet reconnaissance missions and document my
findings. I'm not an
internet
guru or a
gatekeeper, but I have created an excellent
internet resource.
Our physical journeys in the world are just as important as our
mental explorations in the
mind, the
discoveries are endless.
These days I seem to be leaving more
digital footprints than actual footprints,
which seems more meaningful in this day and age.
Quest is
the act of searching for something, searching for
an
alternative that meets your needs.
Quest is a difficult
journey towards a
goal, often symbolic, abstract in
idea or
metaphor. An
adventure.
I'm more of a
knowledge
organizer and Knowledge Sharer than a
knowledge keeper. I
also wouldn't say that I'm
a wisdom keeper, I am more of a wisdom sharer, which makes everyone a
wisdom beneficiary. This is
my
legacy.
I'm just a busy bee in the hive of Knowledge, doing my part to keep the hive
productive.
Beehive is an enclosed structure in which some honey
bee
species of the subgenus Apis live and raise their young.
Knowledge Hive
-
Knowledge Hives
-
The Hive
Knowledge Platform (youtube)
Honeycomb is a mass of
hexagonal prismatic wax cells built by honey bees in their nests to
contain their larvae and stores of honey and pollen.
Polyhedron.
“
For every minute spent organizing, an hour is earned.”
I feel like a
human conduit, a passage, a pipe, a tunnel or a channel for
transferring information and synchronizing information to and from various
destinations.
Vannevar Bush envisioned the internet before modern
computers were being used.
Mundaneum is a non-profit organization based in Mons,
Belgium that runs an exhibition space, website and
archive which celebrate
the
legacy of the original Mundaneum established by Paul Otlet and Henri
La Fontaine in the early twentieth century.
Feltron.
Two Directory
Projects are the work accumulated from one Human Editor -
The Power of One (youtube)
Looking for Adventure.com has over
60,000 handpicked Websites. (External Links) - LFA took
14 years to accumulate as of 2016.
Basic
Knowledge 101.com has over
50,000 handpicked Websites. (External Links) - Took 8 years
to accumulate as of 2016. And as of 2022, it has grown significantly.
The Internet and Computer Digital
Information combined allows a person to save the work that they have done
and create a living record of information and experiences.
Example: Looking for Adventure.com, "not a total copy of my life but getting close". Things don't have to be
life but getting close". Things don't have to be
written in stone
anymore, but it doesn't hurt to have an
extra copy.
When I started in 2008, I didn't know how much knowledge and information I would find, nor did I know what kind of knowledge and information I would find, nor did I know what kind of benefits would come from this knowledge and information. Like a miner in the old days, you dig a little each day and see what you get. And wouldn't you know it, I hit the jackpot. The wealth of information and knowledge that there is in the world is enormous, and invaluable. But we can't celebrate just yet; we still need to distribute our wealth of knowledge and information and give everyone access. Otherwise we will never fully benefit from our wealth of knowledge and information, nor will we ever fully benefit from the enormous potential that it will give us.
"I saw a huge unexplored ocean, so naturally I dove in to take a look. 8 years later in
2016, I have been exploring this endless sea of knowledge, and
have come to realize that I have found a home."
About my Research.
What being a Human Search Engine Represents
A human search engine is more than just a
website with
hyperlinking, and it's more than just
an Information hub or
a
node with
contextual information and
structured grouping.
A Human Search Engine is also more than just knowledge organization, which is a branch of
library and
information science concerned with activities such as document
description, indexing and
classification performed in libraries,
databases, archives, etc..
Intelligence Gathering is a method by which a country
gathers information using non-governmental employees.
Internet Aggregation refers to a web site or computer
software that
aggregates a
specific type of information from
multiple online
sources.
Knowledge Extraction is the creation of
knowledge from
structured
relational databases
or XML, and unstructured text, documents,
images and sources. The resulting knowledge needs to be in a
machine-readable
and machine-interpretable format and must represent knowledge in a manner
that facilitates
inferencing.
Information Extraction -
Information Filtering System -
Knowledge Management -
Database Indexing -
File System -
Knowledge Base -
Knowledge Management -
Media
Curation -
Digital
Curation -
Documentation
Extract, Transform, Load is a process in database usage and
especially in data warehousing that
extracts
data from homogeneous or heterogeneous data sources. Transforms
the data for storing it in the proper format or structure for
the purposes of
querying and analysis. Loads it into the final target such
as a
database or operational data store, data mart, or data
warehouse.
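A toy version of that extract-transform-load pipeline might look like the following; the CSV file, column names, and SQLite table are invented examples.

```python
# A minimal ETL sketch: extract rows from a CSV source, transform them
# into a consistent structure, and load them into a SQLite "warehouse".

import csv
import sqlite3

def extract(path: str):
    """Extract: read raw rows from the data source."""
    with open(path, newline="") as f:
        yield from csv.DictReader(f)

def transform(row: dict) -> tuple:
    """Transform: normalize to the target schema (trimmed name, integer year)."""
    return (row["name"].strip().title(), int(row["year"]))

def load(rows, db_path: str = "warehouse.db"):
    """Load: deposit the rows into the final target database."""
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS sites (name TEXT, year INTEGER)")
    con.executemany("INSERT INTO sites VALUES (?, ?)", rows)
    con.commit()
    con.close()

# load(transform(r) for r in extract("sites.csv"))
```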
Glean is to
extract information from
various sources. Gather, as of natural products. Accumulate
resources.
Academics - My Fundamental Contribution
In a way my work as a
Human Search Engine is my
dissertation.
My
thesis is
Basic Knowledge 101 and proving
the importance of a
Human Operating System in
regards to having a more comprehensive and
effective education. This is my
tenure. My Education Knowledge Database Project. This
is just the beginning of my
intellectual works.
Basic Knowledge 101.com is my
curriculum vitae. Working on this project I went from an
undergraduate
study, through
postgraduate
education right into a
graduate program.
I started out as a non-degree seeking student but I ended up
with a
master's degree and a
doctoral degree, well almost.
I have done my fieldwork, I have acquired specialized skills, I
have done advanced original
research. But I still have no name for my
Advanced Academic Degree. Maybe "
Internet Comprehension 101".
My Business Card -
My Legacy -
HyperLand
(youtube)
Academic Tenure is defending the principle of
academic freedom, which
holds that it is beneficial for society in the long run if scholars are
free to hold and examine a variety of views.
Tenure is to give someone a
permanent post, especially as a teacher
or professor.
Information Science
-
Peer-to-Peer -
Open Source -
Free Open Access.
Internet Studies
is an
interdisciplinary field studying the
social, psychological, pedagogical, political, technical, cultural,
artistic, and other dimensions of the Internet and associated information
and communication technologies.
Internet and
society is a research field that addresses the interrelationship of
Internet and
society, i.e. how
society has changed the Internet and how the Internet has changed society.
How Can One Person Create Databases this Large in Such a
Short Time?
The techniques and methods are quite simple when you're using
the
Internet. You literally have
thousands upon thousands of smart people indirectly doing a tremendous
amount of work for you. This gives individuals the power and the ability
to solve almost any problem. Sharing information and knowledge on a
platform that millions of people can have access to has transformed our
existence in so many ways that people cannot even comprehend the changes
that are happening now, or have happened already, and will most likely
happen in the future.
I worked fast,
searching fast, reading fast, collecting fast and organizing fast, then I
would eventually go back and clean up any mistakes or errors, which could
be weeks or months later. This is because the more you learn, the more you
realize that you have a lot more to learn. The quest for knowledge is a
self-motivating process. The more questions you ask, the more answers you
find, and almost every answer creates more questions, which means that you
will have more answers, which a lot of times, creates more questions. And
this is because many things can't be explained by simple answers. These
learning journeys take your mind to places that you never thought of or
even dreamed of. It is uncharted territory. A dark and empty place that
your mind has never been before. And your thinking takes you there, and
when you enter that place, you turn on the lights and you start to examine
this space. And then you ask yourself, "How did I get here? Do I stay? And
where do I go from here?" But whatever you decide, the light stays on. The
light is a beacon and a symbol of where your mind has been. And that light
you turned on is connected to millions of other lights that your learning
journeys have created. And those lights, as a whole, are your knowledge. And
this knowledge grows. The more lights you have on, the more you can see,
and the more you will be aware of. And as your awareness increases, your
understanding will increase over time as well. And this understanding is
the key to it all, understanding is the key to everything.
First Step: Start making categories.
This is a must because when doing internet searches, for whatever reason,
you are bound to come across information and knowledge that is related to
that particular category that you were not directly searching for. You
will always find information in different places. So you must be ready to
recognize this information and know where it belongs in the category that
you have already created. This is very important because you will most
likely never come across the same information related to those particular
search parameters again, so saving and documenting your findings is very
important. A good tip when searching the internet: when reading the results that a search engine gives you, find other keyword phrases that relate to your subject matter, and then do more searches using those keywords and phrases.
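In code, this first step amounts to little more than keeping a category-to-links map and saving it every time something new is filed; the categories and links below are made-up examples.

```python
# A minimal sketch of the "make categories first" workflow: file each
# found link under a category and persist the result, so a finding is
# never lost.

import json

categories: dict[str, list[str]] = {"Astronomy": [], "Education": []}

def file_link(category: str, url: str):
    """Put a newly found link in its category, creating the category if needed."""
    categories.setdefault(category, []).append(url)

file_link("Astronomy", "https://example.org/star-charts")
file_link("Gardening", "https://example.org/soil-basics")  # a new category appears

# Saving and documenting your findings as you go:
with open("categories.json", "w") as f:
    json.dump(categories, f, indent=2)
```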
Terminology Extraction.
Second Step: When reading, watching TV, watching a movie or even
talking with someone, you are bound to come across ideas and
keywords that you could use when searching for more information
pertaining to your subject. Then again saving and documenting
your findings is very important. It's always a good idea to have
a pen and paper handy to write things down or you can use your
cell phone to record a voice memo so that you don't forget your
information or ideas. The main thing is to have a subject that
you're interested in and at the same time being aware of what
information is valuable to your subject when it finally presents
itself. Combining a
human algorithm with a
randomized
algorithm.
Third Step: Organizing, updating and improving your database
so that it stays functional and easy to access. So my time is
usually balanced between these three tasks, and yes it is time
consuming. You can also use the
Big 6 Techniques when gathering
Information to help with your efficiency and effectiveness.
I also created an
Internet Searching Tips help section for useful ideas.
Glossary.
One Last Thing: If you spend a lot of time on the internet
doing searches and looking for answers, then you are bound to come
across some really useful websites and information that were not
relevant to what you were originally searching for. So it's a
good idea to start saving these useful websites in new
categories or just save them in an appropriately named folder in
your documents. This way you can share these websites with
friends or just
use them at some later time.
It is sometimes called Creating Search Trails, of which I have 21 years' worth as of 2019. Not bad for a
personal web page.
What have I Learned about being a Human Search Engine?
I am a
semantic web as well as a Human Search Engine.
Intelligent humans will
always be better than
machines when it comes to
associations,
perceptions,
perspectives,
categorizing and
organizing,
because there are many things that need to be done manually, especially
when it comes to organizing information and knowledge.
Linking
data,
ontology learning,
library and information science, creating a
visual thesaurus and
tag clouds are what I have been doing for 10 years. "Welcome to Web 3.0." I'm an
intelligent agent combining
logic and
fuzzy logic, because there are just some things that
machines or
Artificial Intelligence cannot
do or do well.
Automated
reasoning systems and
computational logic can only do so much. So we need more
intelligent humans than computer
algorithms.
Creating knowledge bases is absolutely essential. This is why I believe
that having more Human Search Engines is a benefit to anyone seeking
knowledge and information. Structuring websites into syntax link patterns and information into categories or taxonomies while staying objective and impartial.
Organizing information
and websites so that visitors have an easy time finding what they're
looking for, while at the same time showing them other things that are related to that particular subject that might also be of interest to them. More relevant choices, and a great alternative and
complement to
search engines. But it's not
easy to manage and maintain a human search engine, especially
for one person. You're constantly updating the link database,
adding links, replacing links or removing some links altogether.
Then on top of that there's the organizing and the adding of
content, photos and video. And all the while your website grows
and grows. Adding related subjects and subcategorizing
information and links. Cross linking or
cross-referencing so that
related information can be found in more than one place while at
the same time displaying more
connections and more
associations.
Interconnectedness
-
Human
Based Genetic Algorithms -
Principle
of Least Effort -
Abstraction -
Relational Model.
Visible Web - World Wide Web - Dark Web
I have always used the world wide
web or the
surface web for my work. And I have always had a good
connection,
but not a totally secure
connection. So just in case you need to search the web when the mainstream web becomes too risky, you should know about your alternatives.
"Lets share what we know."
World
Wide Web is an information space of
networked computers where
documents and other
web resources are identified by
Uniform
Resource Locators interlinked by
hypertext
links, and can be accessed via the
Internet.
The URL is a
website address.
World Wide Web or
WWW is commonly known as the Web. It is the world's dominant software
platform. It is an
information space where documents and other web
resources can be accessed through the Internet using a web browser. The
Web has changed people's lives immeasurably. It is the primary tool
billions of people worldwide use to interact on the Internet.
The world wide web turned 30 years old in 2019. On
March 12, 1989,
Sir Tim Berners-Lee published his proposal for connecting information
together so that it could be easily shared and accessed, describing how a
“‘web’ of notes with links (like references) between them is far more
useful than a fixed hierarchical system.” But the astronomical growth of
the web could also be its downfall, Berners-Lee warns.
The internet and the world wide web are not the same thing, even
though the terms are sometimes used synonymously. The
internet is a huge
network of
computers, including
servers and data centers located all around the world
– quite literally hundreds of millions of them. The world wide web is one
of the ways people can access the information stored on the internet. When
you send emails they travel via the internet – but not via the web. If
you’re a user of an instant messaging app such as WhatsApp, that uses the
internet but not the web, and the same is true of services like Skype. The
world wide web uses something called hypertext transfer protocol (the HTTP
you’ve seen in web addresses) as the basis for communicating and
distributing information.
A)
Hyper Text Markup Language or HTML. This is the series of formatting
tags and codes used on the web to pull information together and create
links. It is also used to change the way information looks. For example,
to make the word penguin appear bold you would include the following HTML
tags, <b>penguin</b>.
B) Hypertext Transfer Protocol or HTTP. This is a
protocol – an agreed and standard way of doing something. In this case,
it’s referring to the way information residing somewhere online is
connected to, how it links to other resources, and how it is then
delivered to the user’s screen across the web. Other protocols are used
for other services that work over the internet. In the case of email, for
example, you will find MAILTO to indicate which protocol is in use.
C)
Uniform Resource Identifier or URI. You can think of this as a unique
address used to identify the location and properties of each resource
available on the internet. This is particularly relevant here because it
allows for instant identification of information on the world wide web.
You may be familiar with the term URL (uniform resource locator). It’s a
form of URI. Let’s look at this example of a URI:
https://www.weforum.org/ to understand how it works. HTTPS – tells you
this is hypertext-based, therefore it’s on the web. If it said FTP
instead, you’d know it was a file transfer site. The S indicates
encryption is being used for additional security.
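The standard library can take such a URI apart, as in this small sketch using the example address from the text.

```python
# Pulling apart the pieces of a URI with Python's standard urllib.

from urllib.parse import urlparse

uri = urlparse("https://www.weforum.org/")
print(uri.scheme)   # 'https' -- hypertext over an encrypted connection
print(uri.netloc)   # 'www.weforum.org' -- where the resource lives
print(uri.path)     # '/' -- which resource on that host
```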
Web is an
interconnected system of
things or people. A
computer network
consisting of a collection of
internet sites
that offer text and graphics and sound and animation resources through the
hypertext transfer protocol. An intricate network suggesting something
that was formed by
weaving or
interweaving. A fabric (especially a fabric in the process of being
woven). An intricate trap that
entangles or ensnares its victim.
Perhaps the most significant threat to the web – and without doubt, the most existential one – is the development of something most commonly referred to as the splinternet. This refers to the possibility of the web (and indeed the
internet in its entirety) being broken into smaller, regional pieces. In
part it is a reaction by some national governments to what they see as the
undue influence of a small number of tech giants. As those national
governments implement different levels of regulation, the very idea of a
world wide web is cast into doubt. What was once envisaged as a seamless,
borderless, even playing field offering a uniform online experience for
everyone no matter who they are, will be no more.
Splinternet is a characterization of the Internet as
splintering and
dividing due to various factors, such as technology, commerce, politics,
nationalism, religion, and divergent national interests. "Powerful forces
are threatening to balkanise it", writes the Economist weekly, and it may
soon splinter along geographic and commercial boundaries. The Chinese
government erected the "
Great Firewall" for political reasons, and Russia
has enacted the Sovereign Internet Law that allows it to
partition itself
from the rest of the Internet, while other nations, such as the US and
Australia, discuss plans to create a similar firewall to block child
pornography or weapon-making instructions. Clyde Wayne Crews, a researcher
at the Cato Institute, first used the term in 2001 to describe his concept
of "parallel Internets that would be run as distinct, private, and
autonomous universes." Crews used the term in a positive sense, but more
recent writers, like Scott Malcomson, a fellow in New America's
International Security program, use the term pejoratively to describe a
growing threat to the internet's status as a globe-spanning network of
networks.
Surface Web is that portion of the World
Wide Web that is
readily available to the general public and
searchable with standard web search engines. It is the opposite of the deep web.
The surface web is also called the Visible Web, Clearnet,
Indexed
Web, Indexable Web or Lightnet.
The Deep Web consists of those pages that Google and other
search engines don't index.
The Deep Web is about 500 times larger than the Visible Web, but
the Visible Web is much easier to access.
Deep Web (wiki).
The Dark Web is an actively hidden, often
anonymous part of the
deep web but it isn't inherently bad.
Dark Internet
(wiki)
Black Market -
Black Budget -
Shadow Government -
Monopolies
Deep Web Exploring the
Dark Internet, the part of the internet that very few people have ever seen.
Memex (wiki).
How the Mysterious Dark Net is going Mainstream (video)
Virtual Private Network -
Tor Project
Google has indexed 1 trillion pages so far in
2016, but that is
only 5% of the total knowledge and information that we have.
Master
Directory is a file system cataloging structure which
contains references to other computer files, and possibly other
directories. On many computers,
directories are known as folders, or
drawers to provide some relevancy to a workbench or the traditional office
file cabinet.
Web Directory is a directory on the World Wide Web. A
collection of data
organized
into categories. It specializes in linking to other web sites and
categorizing those links.
Web of Knowledge.
Website Library -
Types of Books
Web indexing refers to various methods for
indexing the
contents of a website or of the Internet as a whole. Individual websites
or intranets may use a back-of-the-book index, while search engines
usually use keywords and metadata to provide a more useful vocabulary for
Internet or onsite searching. With the increase in the number of
periodicals that have articles online, web indexing is also becoming
important for periodical websites.
Web Index.
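At its simplest, an index of this kind is an inverted map from keywords to pages, as the toy sketch below shows; the pages are invented examples.

```python
# Web indexing in miniature: an inverted index mapping each keyword to
# the pages that contain it, which is roughly what lets a search box
# answer queries without rescanning every page.

pages = {
    "page1.html": "human search engine organizing the internet",
    "page2.html": "search engines use keywords and metadata",
}

index: dict[str, set[str]] = {}
for url, text in pages.items():
    for word in text.lower().split():
        index.setdefault(word, set()).add(url)

print(index["search"])  # {'page1.html', 'page2.html'}
```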
Semantic Web is an extension of the Web through standards by the
World
Wide Web Consortium or
W3C. The standards promote
common data formats and
exchange protocols on the Web, most fundamentally the
Resource Description
Framework or RDF. The
Semantic Web provides a common framework that
allows data to be shared and reused
across application, enterprise, and community boundaries. The Semantic Web
is therefore regarded as an
integrator across different content,
information applications and systems.
The goal of the
Semantic Web is to make Internet data machine-readable.
Semantic Web Info.
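The machine-readable data at the heart of the Semantic Web reduces to subject-predicate-object triples. The sketch below fakes a tiny triple store with plain tuples; real work would use proper URIs and a library such as rdflib.

```python
# RDF boiled down: a list of subject-predicate-object triples that a
# machine can query directly. A toy sketch, not real RDF tooling.

triples = [
    ("TimBernersLee", "invented", "WorldWideWeb"),
    ("WorldWideWeb", "runsOn", "Internet"),
    ("WorldWideWeb", "uses", "HTTP"),
]

def what(subject: str, predicate: str):
    """Ask the triple store a question a machine can answer."""
    return [o for s, p, o in triples if s == subject and p == predicate]

print(what("TimBernersLee", "invented"))  # ['WorldWideWeb']
```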
Web 2.0
describes the current state of the internet, which has more user-generated
content and usability for end-users compared to its earlier incarnation,
Web 1.0. Web
2.0 generally refers to the 21st-century internet applications that have
transformed the digital era in the aftermath of the dotcom bubble. Web 2.0
is also known as participative or participatory web and social web, which
refers to websites that emphasize user-generated content, ease of use,
participatory culture and interoperability (i.e., compatibility with other
products, systems, and devices) for end users. -
Web 3.0
Machine-Readable Data is data in a format that can be
processed by a computer. Machine-
readable data must be structured data
and in a format that can be easily processed by a computer without human
intervention while ensuring no semantic meaning is lost. Machine readable
is not synonymous with
digitally
accessible. A digitally accessible document may be online, making it
easier for humans to access via computers, but its content is much harder
to extract, transform, and process via computer programming logic if it is
not machine-readable.
Extensible
Markup Language (XML) is designed to be both human- and
machine-readable, and
Extensible
Style Sheet Language Transformation (XSLT) is used to improve
presentation of the data for human readability. For example, XSLT can be
used to automatically render XML in
Portable Document Format (PDF). Machine-readable data can be
automatically transformed for human-readability but, generally speaking,
the reverse is not true.
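For instance, a few facts marked up in XML can be processed by the standard library with no human in the loop; the document structure here is an invented example.

```python
# Machine-readable vs. merely digitally accessible: the same facts in XML
# arrive as labeled fields, with no semantic meaning lost.

import xml.etree.ElementTree as ET

doc = """<catalog>
  <site name="Mundaneum" year="1910"/>
  <site name="World Wide Web" year="1989"/>
</catalog>"""

root = ET.fromstring(doc)
for site in root.findall("site"):
    # Each attribute is directly addressable by a program.
    print(site.get("name"), site.get("year"))
```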
Web Portal is a specially designed web site that brings information together
from diverse sources in a uniform way.
Filtering - Gatekeeping
Information Filtering System is a
system that removes
redundant or
unwanted information from an
information stream, using
semi-automated or computerized methods prior to
presentation to a
human user.
Its main goal is the
management of
any
information overload,
propaganda or
errors, and the improvement of the signal-to-noise ratio. To do this the user's
profile is
compared to some
reference characteristics. These
characteristics may originate from the information item using the content-based
approach, or from the user's social environment using the
collaborative filtering
approach. Filtering should never create a
filter bubble that
influences
biases or blind
conformity.
Filtering is not to be confused with
censorship. A filter is
not a
wall. Filtering is like speed reading, retrieving the most essential information as efficiently as possible.
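A bare-bones content-based filter, as described above, compares a user profile to each item's characteristics and drops what falls below a relevance threshold; the profile, stream, and threshold below are invented examples.

```python
# A minimal content-based information filter: compare the user's profile
# to each item in the stream and keep only the relevant items.

user_profile = {"education", "knowledge", "science"}

stream = [
    "new open access science education resources",
    "celebrity gossip and advertising tricks",
    "organizing knowledge for better education",
]

def relevance(item: str) -> float:
    """Fraction of the profile's interests that the item touches."""
    words = set(item.lower().split())
    return len(words & user_profile) / len(user_profile)

for item in stream:
    if relevance(item) >= 1 / 3:   # keep the signal, drop the noise
        print(item)
```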
Information Overload
-
Relative -
Tuning Out Irrelevant
Information -
High Order
Brain Regions -
Mindful
Filter is a device that
removes something from whatever
passes through it. A
porous device for
removing impurities or solid particles from a liquid or gas passed through
it.
Porous is something full of pores or
vessels or holes allowing passage in and out.
Internet filter is software that restricts or controls the content an
Internet user is capable of accessing, especially when utilized to restrict
material delivered over the Internet via the Web, Email, or other means.
Content-control software determines what content will be available or be
blocked.
Membrane Filters -
Cell Membrane -
Polarizers -
Distillation -
Error
Correcting -
Reptilian
Brain -
Ratings -
Free Speech Abuses -
Social Network
Monitoring -
Search Engines
Data Cleansing is the
process of
detecting and correcting or
removing corrupt or inaccurate records from
a record set, table, or database and refers to identifying incomplete,
incorrect, inaccurate or
irrelevant parts of the data and then replacing,
modifying, or
deleting the dirty or coarse data. Data cleansing may be performed
interactively with data wrangling tools, or as
batch processing through
scripting.
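In miniature, data cleansing looks like the sketch below: detect incomplete or inaccurate records, then modify or delete them; the records and rules are invented.

```python
# Data cleansing in miniature: detect dirty records in a small table and
# correct or delete them.

records = [
    {"name": " Ada Lovelace ", "year": "1815"},
    {"name": "", "year": "1912"},              # incomplete: no name
    {"name": "Alan Turing", "year": "19l2"},   # inaccurate: letter in a number
]

def clean(record: dict) -> dict | None:
    name = record["name"].strip()
    if not name:
        return None                        # delete the irreparable record
    try:
        year = int(record["year"])
    except ValueError:
        return None                        # delete records with corrupt values
    return {"name": name, "year": year}    # modified, now-consistent record

print([r for r in (clean(rec) for rec in records) if r is not None])
```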
Gatekeeping
is the
process through which information is
filtered for
dissemination, whether for publication, broadcasting, the Internet, or
some other mode of
communication. The academic theory of gatekeeping is founded in multiple
fields of study, including communication studies, journalism, political
science, and sociology. It was originally focused on the mass media with
its few-to-many dynamic but now gatekeeping theory also addresses
face-to-face communication and the many-to-many dynamic inherent in the
Internet. The theory was first instituted by social psychologist Kurt Lewin in 1943. Gatekeeping occurs at all levels of the media
structure—from a reporter deciding which sources are chosen to include in
a story to editors deciding which stories are printed or covered, and
includes media outlet owners and even advertisers.
Wisdom Keeper.
Logic Gate
- And, Or, Not
questions? -
Sensory
Gating
Gatekeepers are individuals who decide whether a given message will be distributed by a mass medium. They serve in various roles including academic admissions, financial advising, and news editing.
Not to be confused with Mass Media.
Collaborative Filtering is the process of filtering for information or
patterns using techniques involving
collaboration among multiple agents,
viewpoints, data sources, etc. Sometimes making automatic
predictions
about the interests of a user by collecting preferences or taste
information from many users.
Collaborative Filtering is a technique used by recommender systems.
Collaborative filtering
has two senses, a narrow one and a more general one.
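In the narrow sense, a user-based collaborative filter recommends the items liked by the most similar other user, roughly as in this sketch; the users, ratings, and similarity measure are invented examples.

```python
# A tiny user-based collaborative filter: recommend to one user the items
# that the most similar other user liked.

ratings = {
    "alice": {"siteA": 5, "siteB": 3, "siteC": 4},
    "bob":   {"siteA": 5, "siteB": 2, "siteC": 4, "siteD": 5},
    "carol": {"siteA": 1, "siteB": 5},
}

def similarity(u: dict, v: dict) -> float:
    """Agreement over the items both users rated (1.0 = identical ratings)."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    return sum(4 - abs(u[i] - v[i]) for i in shared) / (4 * len(shared))

def recommend(user: str) -> list[str]:
    others = [(similarity(ratings[user], r), name)
              for name, r in ratings.items() if name != user]
    _, nearest = max(others)
    return [item for item in ratings[nearest] if item not in ratings[user]]

print(recommend("alice"))  # ['siteD'] -- bob's tastes are closest to alice's
```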
Process of Elimination is
a quick way of finding an answer to a
problem by excluding
low probability answers so that you can focus on the
most probable answers. With
multiple
choices you can remove choices that are known to be incorrect so that
your
chances of getting the correct answer
are greater. It is a logical method to identify an entity of interest
among several ones by excluding all other entities.
In
educational
testing, the process of elimination is a process of deleting options
whereby the possibility of an option being correct is close to zero or
significantly lower compared to other options. This version of the process
does not guarantee success, even if only 1 option remains since it
eliminates possibilities merely as improbable.
Reason by Deduction -
Simplifying -
Data Conversion
Filter in
signal processing is
a device or process that removes some unwanted components or
features from a signal. Filtering is a class of signal
processing, the defining feature of filters being the complete
or partial suppression of some aspect of the signal.
Noise.
Media Literacy -
Sensors -
Social Network
Blocking
Abstraction is the act of
withdrawing or
removing something. A general
concept
formed by extracting
common features from
specific examples. The process of
formulating general concepts by
abstracting common properties of instances. A concept or idea not
associated with any specific instance. Preoccupation with something to the
exclusion of all else.
Abstraction
is a conceptual process by which general rules and concepts are derived
from the usage and
classification of
specific examples. Conceptual
abstractions may be formed by filtering the
information content of a
concept or an observable phenomenon, selecting only the aspects which are
relevant for a particular
purpose.
Extracting is to deduce a principle or construe a meaning. Extracting in
chemistry
is to purify or isolate using
distillation. Obtain from and separate a
substance, as by mechanical action. Extracting in mathematics is to
calculate the root of a number.
Extraction (information) -
Connectome
Extract, Transform, Load is the general
procedure of copying data from
one or more sources into a destination system which represents the data
differently from the source(s) or in a different context than the
source(s).
Aggregate is to
form and
gather separate units into a
mass or
whole.
Multimedia Information Retrieval is a research discipline of computer
science that aims at extracting
semantic
information from
multimedia data
sources.
Semantic Web.
Archivist
is an
information professional who
assesses, collects,
organizes,
preserves, maintains control over, and provides access to
records and
archives determined to have
long-term value.
Data Migration is the process of selecting, preparing, extracting, and
transforming data
and permanently
transferring it from one computer storage system to
another.
Data Integration involves combining data residing in different sources
and providing users with a
unified
view of them.
Information Integration is the
merging of information from heterogeneous sources with differing
conceptual, contextual and typographical representations.
Knowledge Integration is the process of synthesizing multiple
knowledge models or
representations into a
common model or representation. Knowledge integration focuses on
synthesizing the understanding of a given subject from different
perspectives. Knowledge integration has also been studied as the process
of incorporating new information into a body of existing knowledge with an
interdisciplinary approach. This process involves determining how the new
information and the existing knowledge interact, how existing knowledge
should be modified to accommodate the new information, and how the new
information should be modified in light of the existing knowledge. A
learning agent that actively investigates the consequences of new
information can detect and exploit a variety of learning opportunities;
e.g., to resolve knowledge conflicts and to fill knowledge gaps. By
exploiting these learning opportunities the learning agent is able to
learn beyond the explicit content of the new information.
Brain Plasticity.
Terminology
Extraction is a subtask of information extraction. The goal of
terminology extraction is to automatically extract relevant terms from a
given corpus. Collect a vocabulary of domain-relevant terms, constituting
the linguistic surface manifestation of domain concepts.
Data Cleansing -
Information Extraction
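A crude terminology extractor can be nothing more than word counts filtered through a stoplist, as below; real extractors add linguistic filters and statistics, and the corpus here is an invented example.

```python
# Terminology extraction in miniature: surface the domain-relevant terms
# of a small corpus by counting words that are not on a stoplist.

from collections import Counter

corpus = [
    "knowledge organization is a branch of information science",
    "information science studies knowledge and its organization",
]

stopwords = {"is", "a", "of", "and", "its"}
counts = Counter(
    word for text in corpus for word in text.split() if word not in stopwords
)

print(counts.most_common(3))
# [('knowledge', 2), ('organization', 2), ('information', 2)]
```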
Data Scraping is a technique in which a computer program extracts data
from human-readable output coming from another program.
Web Scraping is
data scraping used for extracting data from websites sometimes using a
web crawler.
Screen Scraping is the process of
collecting screen display data from one application and translating it so
that another application can display it. This is normally done to capture
data from a legacy application in order to display it using a more modern
user interface.
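A minimal web-scraping sketch with the standard library: parse human-readable HTML back into structured rows. The page snippet is invented, and a real job would fetch the HTML first (as in the web robot sketch earlier).

```python
# Web scraping in miniature: turn human-readable HTML output back into
# structured data by collecting the contents of table cells.

from html.parser import HTMLParser

html = """<table>
  <tr><td>Looking for Adventure</td><td>2002</td></tr>
  <tr><td>Basic Knowledge 101</td><td>2008</td></tr>
</table>"""

class CellScraper(HTMLParser):
    """Collects the text inside <td> cells as the page is parsed."""
    def __init__(self):
        super().__init__()
        self.in_cell, self.cells = False, []
    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True
    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False
    def handle_data(self, data):
        if self.in_cell and data.strip():
            self.cells.append(data.strip())

scraper = CellScraper()
scraper.feed(html)
rows = list(zip(scraper.cells[::2], scraper.cells[1::2]))
print(rows)  # [('Looking for Adventure', '2002'), ('Basic Knowledge 101', '2008')]
```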
Data
Editing is defined as the process involving the review and adjustment
of collected survey data. The purpose is to control the quality of the
collected data. Data
editing can be performed manually, with the
assistance of a computer or a combination of both.
Data Wrangling is the process of transforming and mapping data from
one "raw" data form into another format with the intent of making it more
appropriate and valuable for a variety of downstream purposes such as
analytics. A data wrangler is a person who performs these transformation
operations. This may include further munging, data visualization, data
aggregation, training a statistical model, as well as many other potential
uses. Data munging as a process typically follows a set of general steps
which begin with extracting the data in a raw form from the data source, "munging"
the raw data using algorithms (e.g. sorting) or parsing the data into
predefined data structures, and finally depositing the resulting content
into a data sink for storage and future use.
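Those general steps fit in a few lines: extract raw lines, "munge" them into a predefined structure, and deposit the result in a data sink; the log format and file name are invented examples.

```python
# Data wrangling in miniature: raw lines in, structured records out,
# deposited in a data sink for storage and future use.

import json

raw = [
    "2016-03-01|search|human search engine",
    "2008-07-15|search|knowledge organization",
]

def munge(line: str) -> dict:
    """Parse one raw line into the predefined structure."""
    date, action, query = line.split("|")
    return {"date": date, "action": action, "query": query}

records = sorted((munge(line) for line in raw), key=lambda r: r["date"])

with open("sink.json", "w") as f:   # the data sink
    json.dump(records, f, indent=2)
```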
How do we learn to learn? New research offers an education. Cognitive
training designed to focus on what's important while
ignoring distractions can
enhance the brain's information processing, enabling the ability to '
learn
to learn,' finds a new study on mice. Notably, this process of
ignoring distractions was essential for the mice learning to learn as it
allowed them to do novel cognitive tasks better than the mice that did not
receive
Cognitive Control
Training or CCT. Remarkably, the researchers could measure that CCT
also improves how the mice’s hippocampal neural circuitry functions to
process information. The hippocampus is a crucial part of the brain for
forming long-lasting memories as well as for
spatial navigation,
and CCT improved how it operates for months.
Noisy Text Analytics is a process of information extraction
whose goal is to automatically extract
structured or semistructured information from
noisy unstructured text data.
Fragmented -
Deconstructed
Noisy Text contains noise, which can be seen as all the differences between the surface form of a coded representation of the text and the intended, correct, or original text.
Deep Packet Inspection is
a form of computer network packet filtering that examines the data part
(and possibly also the header) of a packet as it passes an inspection
point, searching for protocol non-compliance, viruses, spam, intrusions,
or defined criteria to decide whether the packet may pass or if it needs
to be routed to a different destination, or, for the purpose of collecting
statistical information that functions at the Application layer of the OSI
(Open Systems Interconnection model).
Focus -
Attention
-
Multi-Tasking
Filtering information is not bad, as long as you are filtering
correctly and focused on a particular
goal so that only
relative information needs
to be
analyzed. The big problem is that
people block relative information
and then they
naively call it
filtering, which it is not. Most people are not aware that they are
blocking relative information, nor are they aware that they have biases
against certain information. So the main difficulty is that people don't
have enough knowledge and information in order to filter information
without blocking relative information or important information. The
process of
information extraction needs to be learned and then practiced,
and also
verified in order to
make sure that the process is effective and efficient. Eventually
misinformation would be
totally eliminated from the media because it could never get through the
millions of people who are knowledgeable enough to quickly identify
false information and then remove
it and also stop the source from transmitting. Millions of filters
will be working together to keep information accurate, that is the future.
Filtering is a normal human process; it's just a matter of when you filter things,
what things you are filtering, and why you filter things that makes all
the difference.
Watch Dogs.
Salience
Network is involved in detecting and filtering salient
stimuli, as well as in
recruiting
relevant
functional networks. Together with its
interconnected brain
networks, the SN contributes to a variety of complex functions,
including communication, social behavior, and self-awareness through the
integration of sensory, emotional, and cognitive information.
People Filters. There are many
different things that you need to learn in order to accurately understand
human behaviors. With certain
people, you need filters. And the filters must be able to be
modified when a person's
behavior changes, or when the environment changes. You need to be
mindful of the filters
that you use, and apply the filters deliberately. Like a switch, your
filters need an on button and an off button. You need to have intention
and
attention so as to avoid any
interference from
cross talk or noise that may come from your
normal processing parameters that you
use in everyday life. There is your professional work, and there is your personal life. You need to have separation, but separation can
sometimes be difficult when your work and your life start to blend because
they share a lot of the same commonalities. It is the
context in which we use our filters
that makes the difference in how we
interpret things. Even though
being aware of everything is
impossible, and even though our awareness can become skewed at times,
we at least need to know what separates our work from our life, so that we
can avoid losing ourselves in the grey areas. Your
core self is your
sanctuary, and you need to protect it at all times. And remember, you need
to keep educating yourself because knowledge is your greatest protection
and the source of all your power.
Gold
Panning is a
mining
process to
extract gold using a flat pan that is used to
separate rocks
from dirt and then gold from dirt using water and a shaking and
swirling
technique. The basic idea is to almost fill the pan with dirt or
gravel that is found near a river, and then adding water to the pan and
then agitating the material in the pan so as to
stratify the dirt and
cause the heaviest stuff to sink to the bottom and the lightest stuff to rise to the top. Pans have various designs that have been developed over the
years, the common features being a means for trapping the heavy materials
during agitation, or for easily removing gold at the end of the process.
Some pans are intended to be used with metal mesh screens. Mesh is a
barrier made of connected strands of metal, fiber, or other flexible or
ductile materials. A
mesh is
similar to a
web or a net in that it has many
attached or
woven strands creating
closely spaced holes that serve to keep leaves, debris, bugs, birds, and
other animals from entering a building or a screened structure such as a
porch, without blocking fresh air-flow. The average
screen has 325 openings in one square inch with a mesh size of 1.2
millimeters or 0.047 inches, this small hole size stops mosquitoes, and
smaller, such as 0.6 millimeters (0.024 in), stops other biting insects.
Mesh
is a measurement of particle size often used in determining the
particle-size distribution of a granular material. Many mesh sizes were
historically given in the number of holes per inch; due to the width of
the wires in the mesh, mesh numbers did not correspond directly to
fractional inch sizes, and several different systems standardized with
slightly different mesh sizes for the same mesh numbers.
Placer Mining is the mining of stream bed
alluvial deposits for minerals.
Alluvium
is loose clay, silt,
sand, or gravel that has been
deposited by running water in a stream bed, on a floodplain.
Stratification is the arrangement or
classification of something into different groups.
Sifting is the act of separating elements
by passing coarser elements through a sieve or strainer or other straining
device in order to separate out other desired elements.
Centrifuge works by
rotating at rapid speeds, thereby separating substances using the power of
centripetal force which is achieved by spinning the fluid at high speed
within a container, thereby separating fluids of different densities (e.g.
cream from milk) or liquids from solids. It works by causing denser
substances and particles to move outward in the radial direction. At the
same time, objects that are less dense are displaced and moved to the
centre. The centrifugal force applied can reach several hundred or several
thousand times that of the earth's gravity.
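How a spinning rotor reaches "several thousand times that of the earth's gravity" follows directly from the rotation rate and radius; a short worked example, with invented rotor numbers.

```python
# Relative centrifugal force: RCF = omega^2 * r / g, with omega in rad/s.

import math

def relative_centrifugal_force(rpm: float, radius_m: float) -> float:
    """How many times earth's gravity the rotor produces at a given radius."""
    omega = 2 * math.pi * rpm / 60   # convert revolutions/min to radians/s
    return omega ** 2 * radius_m / 9.81

# A benchtop rotor spinning at 10,000 rpm with a 10 cm radius:
print(round(relative_centrifugal_force(10_000, 0.10)))  # about 11,000 times g
```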
Artificial
gravity is the creation of an inertial force that mimics the effects
of a gravitational force, usually by rotation. Artificial gravity, or
rotational
gravity, is thus the
appearance of a centrifugal force in a rotating frame of reference.
High-g training is done by aviators and astronauts who are subject to
high levels of acceleration ('g'). It is designed to prevent a g-induced
loss of consciousness (g-LOC), a situation when the action of g-forces
moves the blood away from the brain to the extent that consciousness is
lost. Incidents of acceleration-induced loss of consciousness have caused
fatal accidents in aircraft capable of sustaining high-g for considerable
periods. Centrifugal force produces up to 20 times that of terrestrial
gravity or the
gravity felt on land on
planet earth.
Orbits.
Data
Mining is the process of extracting and discovering patterns in large
data sets involving methods at the intersection of machine learning,
statistics, and database systems.
Information
Foraging is how human users search for information, which evolved to
help our animal ancestors find food.
Metal Detector
is an
instrument that detects the
nearby presence of metal. Metal detectors are useful for finding metal
objects on the surface, underground, and under water. The unit itself
consists of a control box, and an adjustable shaft, which holds a pickup
coil, which can vary in shape and size. If the pickup coil comes near a
piece of metal, the control box will register its presence by a changing
tone, a flashing light, and/or by a needle moving on an indicator. Usually
the device gives some indication of distance; the closer the metal is, the
higher the tone in the earphone or the higher the needle goes. Another
common type are stationary "walk through" metal detectors used at access
points in prisons, courthouses, airports and psychiatric hospitals to
detect concealed metal weapons on a person's body.
The Information Age
We are now living in the
Information Age.
A time where
information and
knowledge are so abundant that we can no longer ignore them. But sadly, not everyone understands what information is, or understands the potential of information, or has
access to information.
The Information age is
the greatest transition of the human race, and of
our planet. The power of knowledge is just beginning to be realized.
Knowledge and information gives us an incredible ability to explore
ourselves, and explore our world and our universe in ways that we have never imagined.
Knowledge and information can improve the lives of every man, woman and
child on this planet. Knowledge and information will also help us
understand the importance of all life forms on this planet like never
before. This is truly the
greatest
awakening of our world.
Preserving Information
-
Information Economy -
Knowledge Economy -
Knowledge Market
-
Knowledge Management
-
Information Literacy
-
Information Stations
-
Information Overload.
Knowledge Open to the Public
Libre Knowledge is
knowledge released in such a way that users
are free to read, listen to, watch, or otherwise experience it; to learn
from or with it; to copy, adapt and use it for any purpose; and to share
the work (unchanged or modified).
Knowledge Commons refers to information, data, and content
that is collectively owned and managed by a community of users,
particularly over the Internet. What distinguishes a knowledge commons
from a commons of shared physical resources is that digital resources are
non-subtractible; that is, multiple users can
access the same digital
resources with no effect on their quantity or quality.
Open Science -
Open Source
Education -
Internet
Open Knowledge is
knowledge that one is free to use,
reuse, and redistribute without legal, social or technological
restriction. Open knowledge is a set of principles and methodologies
related to the production and distribution of knowledge works in an open
manner. Knowledge is interpreted broadly to include data, content and
general information.
Open Knowledge Initiative is an organization responsible for the
specification of software interfaces comprising a Service Oriented
Architecture (SOA) based on high level service definitions.
Open Access Publishing refers to online research outputs that are free
of all restrictions on access (e.g. access tolls) and free of many
restrictions on use (e.g. certain copyright and license restrictions).
Open Data is the idea that some data should be
freely
available to everyone to use and republish as they wish, without
restrictions from copyright, patents or other mechanisms of control.
Open Content describes a creative work that others can copy
or modify.
A Human Search Engine is a lot of work.
I have been working an average of
20 Hours a week since 1998 and over 50 Hours a week since 2006.
With over a
Billion Websites containing
over 450 billion web pages on the
World Wide Web, there's a lot of information to be
organized. And with almost 2 billion people on the internet
there are a lot of minds to collaborate with.
My Human Search Engine
design methods are always improving, but I'm definitely not
a professional
website architect, so there is always more to learn. I'm constantly
multitasking so I do make mistakes from time to time,
especially with
proofreading my own writing,
which seems almost impossible. This is why writers and authors have proofreaders and copy editors, which is something I cannot afford
right now, so please excuse me for my spelling errors and poor
grammar. Besides that I'm still making progress and I'm always
acquiring new knowledge, which always makes these projects
fascinating and never boring.
The Adventures in Learning
You can also look at my website as web Indexing.
Web indexing
means creating indexes for individual Web sites, intranets,
collections of HTML documents, or even collections of Web sites.
Web-indexing.org
Indexes are
systematically arranged items, such as topics
or names, that serve as entry points to go directly to desired
information within a larger document or set of documents.
Indexes are traditionally
alphabetically arranged. But they may
also make use of
Hierarchical Arrangements, as provided by thesauri, or they
may be entirely hierarchical, as in the case of taxonomies. An
index might not even be displayed, if it is incorporated into a
searchable database.
Associations.
Indexing is an analytic process of
determining which concepts are
worth indexing, what entry labels
to use, and how to arrange the entries. As such, Web indexing is
best done by individuals skilled in the craft of indexing,
either through formal training or through self-taught reading
and study.
Indexing is a list of words or phrases ('headings') and
associated pointers ('locators') to where useful material relating to that
heading can be found in a document or
collection of documents. Examples
are an index in the back matter of a book and an index that serves as a
library catalog.
A Web Index is often a browsable list of
entries from which the user makes selections, but it may be non-displayed
and searched by the user typing into a search box. A site
A-Z index is a
kind of Web index that resembles an alphabetical back-of-the-book style
index, where the index entries are hyperlinked directly to the appropriate
Web page or page section, rather than using page numbers.
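As a rough illustration, here is a minimal Python sketch of how a site A-Z index might be grouped and displayed; the entry labels and URLs are made-up examples, not pages from any real site:

# A minimal sketch of a site A-Z index: alphabetized entry labels
# hyperlinked directly to pages, grouped under their first letter,
# like a back-of-the-book index. All entries are hypothetical.

entries = {
    "Archival Science": "/knowledge/archives.html",
    "Boolean Logic": "/math/boolean.html",
    "Bayesian Inference": "/math/bayes.html",
    "Indexing": "/knowledge/indexing.html",
}

az_index = {}
for label in sorted(entries):
    az_index.setdefault(label[0].upper(), []).append((label, entries[label]))

for letter, items in az_index.items():
    print(letter)
    for label, url in items:
        print(f"  {label} -> {url}")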
Interwiki Linking is a facility for creating
links to the many
wikis on the World Wide Web. Users avoid pasting in entire URLs (as they
would for regular web pages) and instead use a shorthand similar to links
within the same wiki (intrawiki links).
I'm like an aisle in the internet
library, organizing data out of necessity while making it
valuable to others at the same time, and eventually connecting to other
human search engines around the world to expand its reach and
capabilities.
I like to describe my website
as being kind of like a lateral
Blog
rather than the usual
Linear Blog, because I update
multiple pages at once instead of just one. As of 2010, around
120,000 new weblogs were being created worldwide each day, but of
the 70 million weblogs that have been created, only around 15.5
million are actually active. Though blogs and
User-Generated Content are useful to some extent, I feel that
too much time and effort is wasted, especially if the
information and knowledge that is gained from a blog is not
organized and categorized in a way that readers can utilize and
access these archives like they would with newspapers. This
way someone can build knowledge-based evidence and facts to use
against corruption and incompetence. This would probably take a
Central Location for all the
blogs to submit to. This way useful knowledge and information
is not lost in a sea of confusion. This is one of the reasons
why this website's information and links will continue to be
organized and updated, so that the website continues to improve.
"
Links
in a Chain"
"There's a lot you don't know, welcome to web 3.0" This
is not just my version of the internet, this is my vision of the
internet. And this is not philosophy, it's just the best idea that I have so far until I can
find something better to add to it, or replace it, or change it.
A Think Tank whose only major influence is
Logic.
"When an old man dies, it's like entire library burning down to the ground. But not for
me, I'll just back it up on the internet."
Search Engines
Organic Search Engine is a search engine that uses human
participation to
filter the
search results and assist users in
clarifying
their search request. The goal is to provide users with a limited number
of
relevant results, as opposed to traditional search engines that often
return a large number of
results that may or may not be
relevant.
Organic
Search is a method for entering one or a plurality of search
items in a single data string into a search engine. Organic search results
are listings on search engine
results pages that appear because of their
relevance to the search terms, as opposed to their being advertisements.
In contrast, non-organic search results may include
pay per click advertising.
Research -
Search Engine Technology
Search is to try to
locate or
discover
something, or try to establish the existence of something. The activity of
looking thoroughly in order to find
something or someone. An
investigation seeking
answers. The
examination
of alternative hypotheses.
Locate is to discover the
location of someone or something. To
determine the place of where something is. To find something by searching
or by examining. To determine or indicate the place, site, or the limits
of something by using an
instrument or by a survey.
Find
is to determine the
existence of something or
to determine the
fact of
something. The act of
discovering something or obtaining something through effort or
management or following an investigation. A productive insight. Come upon
something after searching to get it back or
recover the use of
something. To find the location of something that was missed or lost. Get
something or somebody for a specific purpose. To establish something after
a calculation, investigation, experiment, survey, or study. Be subject to
a specified treatment or analysis. Decide on and make a declaration about
something. Come to believe on the basis of emotion, intuition, or
indefinite grounds.
Answer.
Hybrid Search Engine is a type of computer search engine
that uses different types of data, with or without ontologies, to produce
algorithmically generated results based on web crawling. Previous
types of search engines only use text to generate their results. Hybrid
search engines use a combination of both crawler-based results and
directory results. More and more search engines these days are moving to a
hybrid-based model.
Plant Trees while you
Search the Web. Ecosia search engine has
helped plant almost 18
million trees.
Question and
Answers Format
Search Engine in computing is an
information retrieval system designed
to help find information stored on a computer system. The search results
are usually
presented in a list and are commonly called hits. Search
engines help to minimize the time required to find information and the
amount of information which must be consulted, akin to other techniques
for
managing information overload. The most public, visible form of a
search engine is a Web search engine which searches for information on the World Wide Web.
Indirection is the ability to
reference something using
a name,
reference, or container instead of the value itself. The most
common form of indirection is the act of
manipulating a value through its
memory address. For example, accessing a variable through the use of a
pointer. A stored pointer that exists to provide a reference to an object
by double indirection is called an indirection node. In some older
computer architectures, indirect words supported a variety of more-or-less
complicated addressing modes.
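Python has no raw pointers, but a dictionary can stand in for an address table to show the idea; this is only an illustrative sketch, with made-up keys:

# Indirection sketch: reach a value through a name instead of
# holding the value itself. A dictionary plays the role of memory.

memory = {"addr_1": "the actual value"}

# Single indirection: a "pointer" holding the key of the value.
pointer = "addr_1"
print(memory[pointer])                  # the actual value

# Double indirection: a pointer to a pointer (an "indirection node").
memory["addr_2"] = "addr_1"             # addr_2 holds another address
double_pointer = "addr_2"
print(memory[memory[double_pointer]])   # the actual value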
Probabilistic Relevance Model is a formalism of information retrieval
useful to derive ranking functions used by search engines and web search
engines in order to rank matching documents according to their relevance
to a given search query. It estimates the probability that a
document dj is relevant to a query q. This model assumes that
this probability of relevance depends on the query and document
representations. Furthermore, it assumes that there is a portion of all
documents that is preferred by the user as the answer set for query q.
Such an ideal answer set is called R and should maximize the overall
probability of relevance to that user. The prediction is that documents in
this set R are relevant to the query, while documents not present in the
set are non-relevant.
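The best-known ranking function derived from this framework is BM25. Here is a stripped-down Python sketch of BM25 scoring over a toy three-document corpus; the documents are made up and the tuning constants are just common defaults:

import math

# Stripped-down BM25, a scoring function from the probabilistic
# relevance framework. Documents are lists of words.

docs = {
    "d1": "search engines rank relevant documents".split(),
    "d2": "a human search engine indexes by hand".split(),
    "d3": "trees grow in the forest".split(),
}
N = len(docs)
avgdl = sum(len(d) for d in docs.values()) / N
k1, b = 1.5, 0.75  # common BM25 tuning constants

def bm25(query, doc):
    score = 0.0
    for term in query.split():
        n = sum(1 for d in docs.values() if term in d)   # docs containing term
        idf = math.log((N - n + 0.5) / (n + 0.5) + 1)    # rarity of the term
        f = doc.count(term)                              # term frequency in doc
        score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(doc) / avgdl))
    return score

# Rank all documents by their estimated relevance to the query.
for name, doc in sorted(docs.items(), key=lambda kv: -bm25("search engine", kv[1])):
    print(name, round(bm25("search engine", doc), 3))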
Bayesian Network
is a probabilistic graphical model (a type of statistical model) that
represents a set of random
variables
and their conditional dependencies via a directed acyclic graph (DAG). For
example, a Bayesian network could represent the probabilistic
relationships between diseases and symptoms. Given symptoms, the network
can be used to compute the probabilities of the presence of various
diseases.
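The disease and symptom example can be worked out by hand with Bayes' theorem. A tiny Python sketch for a two-node Disease to Symptom network, using made-up probabilities:

# P(Disease | Symptom) from a two-node network, by direct calculation.

p_disease = 0.01                  # prior: P(Disease)
p_symptom_given_disease = 0.90    # P(Symptom | Disease)
p_symptom_given_healthy = 0.05    # P(Symptom | no Disease)

# Total probability of observing the symptom at all.
p_symptom = (p_symptom_given_disease * p_disease
             + p_symptom_given_healthy * (1 - p_disease))

# Bayes' theorem: probability of the disease given the symptom.
p_disease_given_symptom = p_symptom_given_disease * p_disease / p_symptom
print(round(p_disease_given_symptom, 3))   # about 0.154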
Bayesian
Inference is a method of statistical inference in which Bayes' theorem
is used to update the
probability for a hypothesis as more evidence or
information becomes available. Bayesian inference is an important
technique in statistics, and especially in mathematical statistics.
Bayesian updating is particularly important in the dynamic analysis of a
sequence of data. Bayesian inference has found application in a wide range
of activities, including science, engineering, philosophy, medicine,
sport, and law. In the philosophy of decision theory, Bayesian inference
is closely related to subjective probability, often called "Bayesian probability".
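A minimal sketch of sequential Bayesian updating in Python, assuming a made-up hypothesis (a coin that is either biased toward heads or fair); each observation turns the posterior into the next prior:

# Sequential Bayesian updating: each new piece of evidence updates
# the probability of the "biased coin" hypothesis.

p_biased = 0.5                     # initial prior for the biased hypothesis
likelihood = {"H": (0.75, 0.5),    # P(flip | biased), P(flip | fair)
              "T": (0.25, 0.5)}

for flip in "HHHTH":               # observed evidence, one flip at a time
    l_biased, l_fair = likelihood[flip]
    numerator = l_biased * p_biased
    p_biased = numerator / (numerator + l_fair * (1 - p_biased))
    print(flip, round(p_biased, 3))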
Search Aggregator
is a type of metasearch engine which gathers results from
multiple search
engines simultaneously, typically through RSS search results. It combines
user specified search feeds (parameterized RSS feeds which return search
results) to give the user the same level of control over
content as a
general
aggregator, or a person who
collects things.
Metasearch Engine or
aggregator, is a search tool that uses other
search engines' data to produce its
own results from the Internet. Metasearch engines take input from a user
and simultaneously send out queries to third party search engines for
results. Sufficient data is gathered, formatted by their ranks and
presented to the users.
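A hypothetical Python sketch of the metasearch idea; the two "engines" below are stand-in functions returning canned results, not real search APIs, and the merging rule is a simple rank-based score:

# Metasearch sketch: query several engines, merge their ranked lists.

def engine_a(query):
    return ["example.org/1", "example.org/2", "example.org/3"]

def engine_b(query):
    return ["example.org/2", "example.org/4", "example.org/1"]

def metasearch(query, engines):
    scores = {}
    for engine in engines:
        for rank, url in enumerate(engine(query)):
            # Higher positions earn more points, and results found
            # by several engines accumulate points.
            scores[url] = scores.get(url, 0) + 1 / (rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

print(metasearch("human search engine", [engine_a, engine_b]))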
Prospective Search
is a method of searching on the Internet where the query is given first
and the information for the results are then acquired. This differs from
traditional, or "retrospective", search such as search engines, where the
information for the results is acquired and then queried.
Multitask
Subject Indexing
is the act of describing or classifying a document by index terms or other
symbols in order to indicate what the document is about, to summarize its
content or to increase its findability. In other words, it is about
identifying and describing the subject of documents. Indexes are
constructed, separately, on three distinct levels: terms in a document
such as a book; objects in a collection such as a library; and documents
(such as books and articles) within a field of knowledge.
Search Engine Indexing collects, parses, and stores data to facilitate
fast and accurate information retrieval. Index design incorporates
interdisciplinary concepts from linguistics, cognitive psychology,
mathematics, informatics, and computer science. An alternate name for the
process in the context of search engines designed to find web pages on the
Internet is web indexing.
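The core data structure behind this process is the inverted index, which maps each word to the documents containing it. A minimal Python sketch with made-up pages:

# Build an inverted index so lookup is a fast dictionary access
# instead of scanning every page for every query.

pages = {
    "page1.html": "human search engine indexed by human eyes",
    "page2.html": "search engines use ranking algorithms",
    "page3.html": "planting trees while you search the web",
}

inverted_index = {}
for url, text in pages.items():
    for word in set(text.split()):
        inverted_index.setdefault(word, set()).add(url)

print(sorted(inverted_index["search"]))   # all three pages
print(sorted(inverted_index["human"]))    # page1.html only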
Indexing -
Amazing Numbers and Facts
Text
Mining also referred to as text data mining, roughly equivalent to
text analytics, refers to the process of deriving high-quality information
from text. High-quality information is typically derived through the
devising of patterns and trends through means such as statistical pattern
learning. Text mining usually involves the process of structuring the
input text (usually parsing, along with the addition of some derived
linguistic features and the removal of others, and subsequent insertion
into a database), deriving
patterns
within the structured data, and finally evaluation and interpretation of
the output. 'High quality' in text mining usually refers to some
combination of relevance, novelty, and interestingness. Typical text
mining tasks include text categorization, text clustering, concept/entity
extraction, production of granular taxonomies, sentiment analysis,
document summarization, and entity relation modeling (i.e., learning
relations between named entities). Text analysis involves information
retrieval, lexical analysis to study word frequency distributions, pattern
recognition, tagging/annotation, information extraction, data mining
techniques including link and association analysis, visualization, and
predictive analytics. The overarching goal is, essentially, to turn text
into data for analysis, via application of natural language processing
(NLP) and analytical methods. A typical application is to scan a set of
documents written in a natural language and either model the document set
for predictive classification purposes or populate a database or search
index with the information extracted.
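As a small taste of one task from that list, here is a Python sketch of word-frequency analysis with stop-word removal; the sample text and stop-word list are made up for the example:

from collections import Counter
import re

# Structure the input text, then derive a simple pattern:
# which terms dominate after the noise words are dropped?

text = """Search engines index the web. A human search engine
indexes the web by hand, and organizes knowledge for the reader."""

stopwords = {"the", "a", "and", "by", "for"}

# Lowercase, strip punctuation, drop stop words.
words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in stopwords]

print(Counter(words).most_common(5))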
Social Search is a behavior of retrieving and searching on a
social searching engine
that mainly searches user-generated content such as news, videos and
images related search queries on social media like Facebook, Twitter,
Instagram and Flickr. It is an enhanced version of web search that
combines traditional algorithms. The idea behind social search is that
instead of a machine deciding which pages should be returned for a
specific query based upon an
impersonal algorithm, results that are based on the human network of
the searcher might be more relevant to that specific user's needs.
Interactive Person to Person Search Engine
Brave Search API.
Power your search and AI apps with the fastest growing independent search
engine since Bing. Access an index of billions of pages with a single
call.
Gimmeyit Search Engine was a crowd-source-based search engine using
social media content to find relevant search results rather than the
traditional rank-based search engines that rely on routine cataloging and
indexing of website data. The crowd-source approach scans social media
sources in real-time to find results based on current social "buzz" rather
than proprietary ranking algorithms being run against indexed sites. With
a crowd source approach, no websites are indexed and no storage of website
metadata is maintained.
Tagasauris -
Public Data
Selection-Based Search
is a search engine system in which the user invokes a search query using
only the mouse. A selection-based search system allows the user to search
the internet for more information about any keyword or phrase contained
within a document or webpage in any software application on his desktop
computer using the mouse.
Web Searching Tips
Rummage is to search haphazardly through a jumble of things. Ransacking.
Web Portal is most often a specially designed web site that
brings information together from diverse sources in a uniform way.
Usually, each information source gets its dedicated area on the page for
displaying information (a portlet); often, the user can configure which
ones to display.
Networks -
Social Networks
Router
is a networking device that forwards data packets between computer
networks. Routers perform the traffic directing functions on the Internet.
A data packet is typically forwarded from one router to another through
the networks that constitute the internetwork until it reaches its
destination node.
Interface -
Computer -
Internet
-
Web of Life
Window to the World -
Open Source
A Human Search Engine also includes:
Archival Science
-
Archive -
Knowledge Management -
Library Science -
Information Science.
Reflective Practice
-
Research -
Science
Tracking -
Interdiscipline -
Thesaurus
Human-Based Computation
is a computer science technique in which
a
machine performs its function by outsourcing certain steps to humans,
usually as microwork. This approach uses differences in abilities and
alternative costs between humans and computer agents to achieve
symbiotic human-computer
interaction. In traditional computation, a human employs a computer to
solve a problem; a human provides a formalized problem description and an
algorithm
to a computer, and receives a solution to interpret. Human-based
computation frequently reverses the roles; the computer asks a person or a
large group of people to solve a problem, then collects, interprets, and
integrates their solutions.
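A toy Python sketch of that role reversal, where the "workers" are stand-in functions rather than real people, and the computer integrates their answers by majority vote:

from collections import Counter

# The computer poses a task it is bad at to several humans,
# then integrates the collected answers by majority vote.

def ask_humans(task, workers):
    answers = [worker(task) for worker in workers]
    return Counter(answers).most_common(1)[0][0]

workers = [lambda t: "cat", lambda t: "cat", lambda t: "dog"]
print(ask_humans("What animal is in this photo?", workers))   # cat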
Internet Searching Tips
Knowing how to ask a question,
knowing which questions to ask, and knowing how to analyze the answers,
may eventually find you an answer that is useful and worth saving.
When searching the
internet, you have to use more than one
search engine in order
to do a complete search. Using one search engine will narrow
your findings and may possibly keep you from finding what you're
looking for, because most search engines are not perfect and are
sometimes unorganized, flawed and manipulated. This is why I'm
organizing the Internet, because search engines are flawed and
thus cannot be fully depended on for accuracy.
Adaptive
Search
Example:
Using the same exact keywords on 4 different
search engines, I found the website that I was looking for
in the number one position on 2 of the 4 search engines,
but I could not find that same website on the other search
engines unless I searched several pages deep. So some search
engines are flawed or manipulated and others are
not. There is also a chance that the webpage you are looking for is
not titled correctly, so you may have to use different keywords
or phrases in order to find it. But even then there is no
guarantee, because search engines also use other factors when
calculating the results for particular words or phrases. And
what all those other factors are and how they work is not
exactly clear.
When searching the Internet,
sometimes going
several pages deep on search engines will also help you find
information, because the first 10 choices are sometimes
irrelevant. I have sometimes found things that I'm looking for
30 pages deep. You will also find different keywords, phrases
and characters within the search results that may help
increase your odds of finding what you're looking for. Sometimes
checking a website's links on its resources page may also help
you find websites that are not listed correctly in search
engines. Web searching for information needs to be a
science.
Human Search Engine Tips
Search engines are in fact a highly important
social service, just like a
congressman or
president, except not corrupted of course. If you honestly cannot say exactly how and why you performed a
particular action, then how the hell are people supposed to
believe you or understand what they need to do in order to fix
your mistake, or at least, confirm there was no mistake?
Transparency,
truth and knowing the
facts for these particular services are absolutely
necessary. People have the right not to be part of a
blind experiment. These Systems need to be
open,
monitored and
audited in order for them to work accurately and efficiently.
Most search engines like Google have
advanced searching tools found on the side or at the bottom
of their search pages.
Knowing where to
type in certain characters in your search phrases also helps you
find what you're looking for.
If you want to limit your searches on
Google to only education websites or government websites
then type in "site:edu" or "site:gov" after your keyword or
search phrase.
For example: Teaching Mathematical Concepts site:edu
For searching a specific website, type in "site:harvard.edu" after
the word or search phrase, for example: neutrino site:harvard.edu
To narrow your searches to file types like PowerPoint, Excel or
PDFs, type in "filetype:ppt" after the word.
For search ranges, use 2 periods between 2 numbers, like "Wii
$200..$300".
Using quotes or a + or - within your search
phrases can also help. For example, imagine you want to find pages that have
references to both President Obama and President Bush on the
same page. You could search this way:
+President Obama +President Bush.
Or if you want to find pages that have just President Obama and
not President Bush, then your search would be:
President Obama -President Bush.
If you are looking for sand
sharks, search engines will give you results with the words sand
and sharks, but if you use quotation marks around
"sand sharks" it will help narrow your search.
Using "~" (tilde)
before a search term yields results with related terms.
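To tie these operators together, here is a small hypothetical Python helper that assembles them into one query string; the helper itself is not part of any search engine, only the operators are real search syntax:

# Assemble search operators into a single query string.

def build_query(phrase, site=None, filetype=None, exclude=None,
                exact=False, price_range=None):
    query = f'"{phrase}"' if exact else phrase
    if exclude:
        query += f" -{exclude}"           # exclude a term
    if site:
        query += f" site:{site}"          # limit to one site or domain
    if filetype:
        query += f" filetype:{filetype}"  # limit to a file type
    if price_range:
        low, high = price_range
        query += f" ${low}..${high}"      # numeric range with two periods
    return query

print(build_query("sand sharks", exact=True))
print(build_query("neutrino", site="harvard.edu"))
print(build_query("Wii", price_range=(200, 300)))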
Regular Expression is a sequence of characters that
define a search pattern. Usually this
pattern
is used by string searching algorithms for "find" or "find and replace"
operations on strings, or for input validation. It is a technique that
developed in theoretical computer science and formal language theory.
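A few basic regular expression operations in Python, using a made-up text; the email-like pattern here is deliberately simplistic:

import re

text = "Contact: info@example.org or help@example.org"

# Find: a simple pattern for email-like strings.
pattern = r"[\w.]+@[\w.]+"
print(re.findall(pattern, text))

# Find and replace: hide the addresses.
print(re.sub(pattern, "[email removed]", text))

# Input validation: does a whole string match the pattern?
print(bool(re.fullmatch(pattern, "info@example.org")))   # True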
For conversions, try typing "50 miles in kilometers" or "100
dollars in Canadian dollars".
Use Google to do math: just enter a calculation as you would into
your computer's calculator (i.e. * corresponds to multiply, / to
divide, etc.).
To find the time in a certain place, type in "Time: Danbury, CT".
Just got a phone call and want to see where the call is from?
Type in the 3-digit area code.
Type any address into Google's main search bar for maps and
directions.
While on Google Maps, select the day of the week and the time of
day for the traffic forecast.
What are people searching for, and what keywords are they using?
Search Query Trends -
Google Insights Search Trends -
Google
Trends -
Google
-
Yahoo Alexa Web Trends
You can learn even more great search tips by visiting this website
Search Engine Watch.
Learning Boolean logic can also help improve your Internet searching skills.
Boolean Operators (youtube)
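A quick Python sketch of Boolean operators applied to search, using made-up page sets from an inverted index; AND narrows, OR widens, NOT excludes:

# Boolean operators as set operations over an inverted index.

index = {
    "sand":   {"page1", "page2", "page3"},
    "sharks": {"page2", "page4"},
    "beach":  {"page1", "page3"},
}

print(index["sand"] & index["sharks"])   # AND: sand AND sharks -> {'page2'}
print(index["sand"] | index["sharks"])   # OR: pages with either word
print(index["sand"] - index["beach"])    # NOT: sand NOT beach -> {'page2'}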
If you're on a website using the
Firefox browser, you can right click on the page, then click
on "Save Page As", and it will save the entire page on your computer
so that you can view that page when you are offline, without the
need of an internet connection.
Algorithm Censorship
Google
censors search
results, while at the same time killing thousands of small businesses, and
not only that, they influenced other people to censor information
and corrupt the system even more. Why do corporations become greedy and
criminal? And why do they cause others to repeat this
madness?
Money and
Power
is a cancer in the wrong hands. So the
Dragonfly censored secret search engine is nothing new. And on top of
that, people are getting bombarded with
robocalls because of google, and people have to visit websites littered
with adds because of google, which is abusive and one of the reasons why
google has been sued several times for millions of dollars. And the
lawsuits are not stopping these abuses.
The Google algorithm works OK most of the time, but it is also
used to censor websites unfairly.
This is
corruption at its worst.
Evil charlatans finding
a new low, and
no one is safe.
Problems with Google -
Information Bubbles -
Google Search Results
Google Fined $1.7 Billion by EU for Blocking Advertising Rivals.
Alphabet's Google was fined $1.7 billion by the European Union for
limiting how some websites could display ads sold by its rivals.
Penguin (wiki) -
EMD (wiki) -
Panda (wiki) -
Google Bomb (wiki)
Criticism of Google (wiki)
Life through
Google's Eyes: Google's instant autocomplete automatically fills
in words and phrases with search predictions and suggestions, sometimes
with disturbing results.
Search Engine Failures -
Algorithms
-
Search
Algorithm -
Human Search Engine
Internet -
Internet Safety
"If you are indexing information, that should be your focus.
If information is judged on irrelevant factors, then you will
fail to correctly distribute information, which will make
certain information in search results unreliable, illogical and corrupted."