Companies need the most recent business statistics on the companies and markets they have a stake in. They must understand the competition and how to manage their businesses better. What they often lack is the time and resources to gather this information themselves.

As an Internet researcher, you are the one who digs up this information for the companies in question and converts the data into formats they can use.

Career Information Involving Internet Research

When it comes to being an online researcher, there's a lot more to it than you might think. For example, did you know that web researchers make an average of $26.49 per hour? According to ZipRecruiter, that equates to $55,104 per year!

Between 2018 and 2028, the internet research field is expected to grow by 20%, resulting in 139,200 job openings across the United States.

Internet research is becoming a daily task for many people. Some careers require it more than others. Below are great career options for people who like doing research online and want to do it full-time.

Job Title | Median Salary (2020)* | Job Growth (2019-2029)*
Freelance Researcher | — | —
Web Researcher | — | —
Technical Writer | — | —
Private Investigator | $53,320 (for all private detectives and investigators) | 8% (for all private detectives and investigators)
Market Research Analyst | — | —

Source: *U.S. Bureau of Labor Statistics

As of August 3, 2021, the average pay for a Freelance Researcher in the United States is $32.06 per hour.

In North America, the average salary for a freelance Internet researcher is $62,000, although it varies widely by firm, region, sector, expertise, and benefits. This compensation was computed using the average salary for all positions having the keyword "internet research specialists" somewhere in the job description, and it is before the Association's professional researcher credentials are taken into account.

While ZipRecruiter lists hourly wages as high as $78.85 and as low as $7.45, the majority of Freelance Researcher wages in the U.S. today fall between $16.83 (25th percentile) and $39.90 (75th percentile). The typical pay for a Freelance Researcher varies widely (by as much as $23.08 per hour), implying that there may be many prospects for growth and higher income depending on skill level, location, and years of experience.

What is the Role of an Internet Researcher?

An internet researcher, or freelance researcher, finds all kinds of information, from facts about companies to specific contacts such as e-mail addresses and phone numbers. All sorts of businesses and organizations use Internet researchers, including scientific organizations, magazines, and business firms.

Internet researchers utilize their skills to find information on a variety of topics on the Internet. In this field, you can work as a freelancer, contract employee, investigator, or direct employee of an organization. To find information, you must be able to use a search engine to access scholarly research, industry publications, and think tank documents and use the Internet as a primary tool for information gathering.

How to Become an Internet Researcher (or Internet Research Specialist)

Finding relevant and credible information requires specialized training and web search skills due to the sheer size of the World Wide Web and the rapid growth of indexed web pages.

Because the job of an Internet researcher requires constant use of the Internet, the Association of Internet Research Specialists (AofIRS) states that a researcher must first understand how the Internet works: how to access and display web pages in a web browser, how a search engine reads and understands search keywords and phrases, how engines interpret the intent behind those keywords, and how to choose keywords and create a search phrase that works.

You must also learn how to create complex search queries and search tools for the project you are working on. Data analysis is an essential part of an Internet researcher's job because you must comprehend and interpret the information you find.

To work as a web researcher, you don't need a degree. Instead, all you need to know is how to use web tools to gather reliable, truthful information swiftly and accurately. You'll also need a fundamental understanding of search engine basics, so you can see how keyword choice affects your ability to find information. It's also beneficial to become certified in specific research areas, so you know where to look and whether or not a resource is reliable; a certification or degree may be helpful here. An understanding of market research and demographic data, as well as exceptional writing abilities, is required for this position.

What are the Most Reputable Online Research Certifications and Courses?

So far, the Certified Internet Research Specialist (CIRS) program has the most comprehensive curriculum and the most well-rounded subject matter that meets industry standards. The Association of Internet Research Specialists provides the most reputable web research courses and professional-level certifications available in web/internet research. The CIW Internet Business Associate course is another option. Some universities offer short online search courses, but most are geared toward librarians.

Certified Internet Research Specialist (CIRS™) Training Program

  • CIRS™ is the first of its kind: a Gold Standard certification for practitioners of online research and a globally recognized professional certificate program offered by AofIRS. The Association is a self-regulatory body governed by the "Charter of Associations" granted by the Government of Canada.
  • The CIRS Training Guide, 5th edition, is aligned with the new 2021 curriculum and is best suited as a practical guide or reference for anyone who uses the Internet as a primary tool for information gathering.
  • In this comprehensive course, you will learn a practical approach to building complex search queries and choosing search tools. As a CIRS, you are expected to be knowledgeable in selecting the best search tools, proficient in building Internet search queries, and familiar with Internet law and related ethical issues.
  • Learners who pass the exam receive the Certified Internet Research Specialist (CIRS) certification, an industry-recognized credential that comes with certified membership benefits. With hands-on training as an Internet Research Specialist, you can earn a higher salary.
  • Improve your search skills and advance your career in the future skill of online research, and strengthen your network by getting to know industry experts from all over the world.
  • This program is available as both self-study and online training classes. No prior knowledge or experience in research is required to take this course.

CIW's Internet Business Associate Training

  • Internet Business Associate Training is a broad course that covers a wide range of topics in today's web-based business environment.
  • Among other things, the course teaches learners about databases and their relationships to search engines, as well as how to use search engines to carry out basic and advanced web research.
  • You will learn how to connect to the Internet, Internet protocols, the Domain Name System (DNS), and cloud computing, as well as the fundamental functions of Web browsers, the components of Web addresses, the use and management of cookies, and the use of browsers in the business world.
  • Upon successfully passing and completing the course, learners receive the CIW Internet Business Associate certification, which is an industry-recognized certification that comes with salary perks.
[Source: Uploaded by the Association Member: Erin R. Goodrich] 
Published in Online Research

Internet Research is "a practice of careful and diligent search of relevant and reliable online information, especially free information on the World Wide Web."

How to Become a CIRS™ (Certified Internet Research Specialist)?

Becoming an Internet Research Specialist requires specialized training and skill in performing effective and efficient Web searches with an acceptable level of accuracy. The Certified Internet Research Specialist (CIRS™) credential is considered a Gold Standard training program offered by the Association of Internet Research Specialists (AOFIRS) for the high-demand, Internet-driven knowledge industry.

Essential Learning Areas for CIRS Certification

To meet today's digital challenges, particularly where the Internet is a primary tool for information procurement, the AOFIRS considers four Essential Learning Areas (ELA) in bestowing CIRS™ credentials.

  1. Internet Technology – This module answers common questions: What constitutes a Web site? How do Web applications work on the Internet? How do a Web browser and search engines work together to deliver Web pages?
  2. Research Methods and Online Research Techniques – This module teaches conventional research methods and standards applied in the research industry. Students learn techniques for performing Web searches with advanced search queries and are introduced to essential online research tools for conducting specialized research on the Internet.
  3. Research in Business and the Business of Research – Students learn about the information needs of businesses and become familiar with the various research reports used by management, shareholders, and lenders. For those looking to run an independent research practice, this module discusses the essentials of establishing and running a successful business, including market opportunities and client engagement documentation.
  4. Internet Law and Ethics – This module covers areas of Internet law that provide legal protections to researchers who use Internet information resources in their work. Students learn about Internet jurisdiction, intellectual property law, cybercrime, defamation, privacy, and e-commerce-related laws. The curriculum also discusses the ethical issues attached to procuring online data involving human subjects.

This course is designed to serve as a self-study training guide to help you prepare for the CIRS™ examination. The book is best suited as a practical guide for mastering the art of online research by anyone who uses the Internet as a primary tool for information gathering; the subjects covered will help you build complex search queries using practical search methods and tools. See the CIRS Certification Syllabus for additional information.

CIRS™ Qualifying Examinations

The Certified Internet Research Specialist (CIRS™) exam is given online as four separate exam modules. Students attempt each module separately, at their own pace and at a time of their choosing. The table below shows the number of questions in each exam and the time allowed to finish.

[Table: number of questions and time allowed for each CIRS™ exam module]

  • These are online exams, available to students 24x7 from anywhere with Internet access.
  • Exam results are available immediately after the student submits all answers.
  • Students must score 75% to pass each exam module. Each question carries one point, and an answer must be fully correct, not partially correct, to earn its point.
  • There are no limitations to the number of attempts on each exam taken online.

Preparing for CIRS™ Qualifying Examinations

Preparing for the CIRS™ requires a commitment of time and a focused study plan for this challenging exam. Based on our instructors' experience teaching the course and feedback from students following self-study preparation plans, an average student needs 35-40 hours or more to prepare for the CIRS™ exam. Even this amount of preparation is no guarantee that a student will pass on the first try.

AOFIRS offers two approaches to preparing for CIRS Certification.

  1. A CIRS Self-study plan for students preparing with the AOFIRS approved self-learning training material. (CIRS Training Guide 5th Edition)
  2. An online training class that the AOFIRS conducts with trained instructors.

AOFIRS publishes its official CIRS™ training material, including e-books, reference guides, notes, PowerPoint presentations, recorded lectures, and practice exam questions and answers.

What are the Topics Covered in CIRS™ Course?

AOFIRS offers the most comprehensive and well-rounded curriculum available today, preparing modern researchers to meet new challenges in Internet information use and procurement. An outline of the topics is shown in the diagram below.

[Diagram: outline of topics covered in the CIRS™ course]

Download CIRS™ Certification Syllabus

Who can use CIRS Certification?

The CIRS Certification is useful for anyone looking to search the Internet for free information sources and find valuable online databases that provide reliable and relevant information. CIRS-certified professionals include professional services providers (such as independent contractors, freelance researchers, web researchers, lawyers, private investigators, financial analysts, and writers), corporate executives, entrepreneurs, and government and non-profit associations.

Published in Online Research

The majority of links are considered to be commercial in nature, according to new research. Dan Petrovic, aka @DejanSEO, has just published the results of a quantitative study of 2,000 web users in the US and Australia. It was set up to discover perceptions about why web publishers link out.

According to the research, more than 40% of users think that outbound links from one web page to another are there because they generate revenue for the publisher.

‘Marketing Advertising & Revenue’ was seen to be the number one reason why a link exists, with almost a third of users expecting there to be some kind of commercial arrangement in place.

‘Promotion, Relationship & Sponsorship’ was chosen by a further 9% of respondents. Money, money, money. Meanwhile, just one in five people recognised links as organic citations to help stand up the information on a web page.

[Chart: perceived reasons for outbound links, from the 2016 study]

All in all, the analysis of the results found that more than half of links exist for commercial reasons, with only 34% seen to be non-commercial.

Classifying the different types of link

I really like Dan’s classification of links, which now straddles 10 distinct areas (though there is a good amount of cross-over). Further details on each of these link types can be found in the study itself. For example, you might file that outbound link under ‘expansion’, because it’s there for further reading and insight into this topic. Dan makes the point that since many of these link types overlap, it can be hard to spot the true intent as to why a link exists.

Some links that look natural – and which are genuinely useful – might actually be there because of some business or personal relationship. That doesn’t automatically make them sketchy. It’s just human nature.

Of course Google doesn’t necessarily see it that way. Many people fear the dreaded manual penalty and go the extra mile to neuter links, even when they have perfectly valid reasons to point visitors to their friends and siblings.

Dan says:

“I see a lot of websites nofollow links to their partner websites, sister companies and various other forms of affiliation because they were told to do so by their SEO or even someone in Google’s webspam team.

“This sort of madness has to stop. If commercially-driven links exist on the web organically then they’re organic in nature and shouldn’t be treated as ‘clean-up material’ nor should those links be penalty-yielding.”

Hear, hear.

I’d love to know the gap between how users perceive links and the actual reasons why the author / publisher put them in place. Presumably it is quite large…


Published in Online Research

Twenty-seven years ago, Tim Berners-Lee created the World Wide Web as a way for scientists to easily find information. It has since become the world's most powerful medium for knowledge, communications and commerce — but that doesn't mean Mr. Berners-Lee is happy with all of the consequences.


''It controls what people see, creates mechanisms for how people interact,'' he said of the modern-day web. ''It's been great, but spying, blocking sites, repurposing people's content, taking you to the wrong websites — that completely undermines the spirit of helping people create.''


So on Tuesday, Mr. Berners-Lee gathered in San Francisco with other top computer scientists — including Brewster Kahle, head of the nonprofit Internet Archive and an internet activist — to discuss a new phase for the web.


Today, the World Wide Web has become a system that is often subject to control by governments and corporations. Countries like China can block certain web pages from their citizens, and cloud services like Amazon Web Services hold powerful sway. So what might happen, the computer scientists posited, if they could harness newer technologies — like the software used for digital currencies, or the technology of peer-to-peer music sharing — to create a more decentralized web with more privacy, less government and corporate control, and a level of permanence and reliability?


''National histories, the story of a country, now happen on the web,'' said Vinton G. Cerf, another founder of the internet and chief internet evangelist at Google, in a phone interview ahead of a speech to the group scheduled for Wednesday. ''People think making things digital means they'll last forever, but that isn't true now.''


The project is in its early days, but the discussions — and caliber of the people involved — underscored how the World Wide Web's direction in recent years has stirred a deep anxiety among some technologists. The revelations by Edward J. Snowden that the web has been used by governments for spying and the realization that companies like Amazon, Facebook and Google have become gatekeepers to our digital lives have added to concerns.


On Tuesday, Mr. Berners-Lee and Mr. Kahle and others brainstormed at the event, called the Decentralized Web Summit, over new ways that web pages could be distributed broadly without the standard control of a web server computer, as well as ways of storing scientific data without having to pay storage fees to companies like Amazon, Dropbox or Google.



Efforts at creating greater amounts of privacy and accountability, by adding more encryption to various parts of the web and archiving all versions of a web page, also came up. Such efforts would make it harder to censor content.


''Edward Snowden showed we've inadvertently built the world's largest surveillance network with the web,'' said Mr. Kahle, whose group organized the conference. ''China can make it impossible for people there to read things, and just a few big service providers are the de facto organizers of your experience. We have the ability to change all that.''


Many people conflate the internet's online services and the web as one and the same — yet they are technically quite different. The internet is a networking infrastructure, where any two machines can communicate over a variety of paths, and one local network of computers can connect with other networks.


The web, on the other hand, is a popular means to access that network of networks. But because of the way web pages are created, managed and named, the web is not fully decentralized. Take down a certain server and a certain web page becomes unavailable. Links to pages can corrode over time. Censorship systems like China's Great Firewall eliminate access to much information for most of its people. By looking at internet addresses, it is possible for governments and companies to get a good idea of who is reading which web pages.


In some ways, the efforts to change the technology of creating the web are a kind of coming-of-age story. Mr. Berners-Lee created the World Wide Web while working at CERN, the European Organization for Nuclear Research, as a tool for scientists. Today, the web still runs on technologies of the older world.


Consider payments. In many cases, people pay for things online by entering credit card information, not much different from handing a card to a merchant for an imprint.


At the session on Tuesday, computer scientists talked about how new payment technologies could increase individual control over money. For example, if people adapted the so-called ledger system by which digital currencies are used, a musician might potentially be able to sell records without intermediaries like Apple's iTunes. News sites might be able to have a system of micropayments for reading a single article, instead of counting on web ads for money.


''Ad revenue is the only model for too many people on the web now,'' Mr. Berners-Lee said. ''People assume today's consumer has to make a deal with a marketing machine to get stuff for 'free,' even if they're horrified by what happens with their data. Imagine a world where paying for things was easy on both sides.''


Mr. Kahle's Internet Archive, which exists on a combination of grants and fees from digitizing books for libraries, operates the Wayback Machine, which serves as a record of discontinued websites or early versions of pages.


To make that work now, Mr. Kahle has to search and capture a page, then give it a brand new web address. With the right kind of distributed system, he said, ''the archive can have all of the versions, because there would be a permanent record located across many sites.''


The movement to change how the web is built, like a surprising number of technology discussions, has an almost religious dimension.


Some of the participants are extreme privacy advocates who have created methods of building sites that can't be censored, using cryptography. Mr. Cerf said he was wary of extreme anonymity, but thought the ways that digital currencies permanently record transactions could be used to make the web more accountable.


Still, not all the major players agree on whether the web needs decentralizing.


''The web is already decentralized,'' Mr. Berners-Lee said. ''The problem is the dominance of one search engine, one big social network, one Twitter for microblogging. We don't have a technology problem, we have a social problem.''


One that can, perhaps, be solved by more technology.





Published in Online Research

Cyberspace is not like your library

Librarians have a weird sense of humor. This is now an old joke: the internet is like a library with no catalog, where all the books get up and move themselves every night... This was the state of the internet up until 1995 or thereabouts. Finding anything on the internet required comic-strip characters like Archie, Veronica and Jughead, and generally you were the one who ended up feeling like a jughead when you rooted around for hours and still came up dry.

The new joke is:

The internet is like a library with a thousand catalogs, none of which contains all the books and all of which classify the books in different categories—and the books still move around every night. The problem now is not that of "finding anything" but finding a particular thing. When your search term in one of the popular search engines brings back 130,000 hits, you still wonder if the one thing you're looking for will be among them.

This can be an enormous problem when you're trying to do serious research on the internet. Too much information is almost worse than too little, because it takes so much time to sort through it to see if there's anything useful. The rest of this section will give you some pointers to help you become an effective internet researcher.

Get to know the reference sources on the internet

Finding reference material on the Web can be a lot more difficult than walking into the Reference Room in your local library.

The subject-classified Web directories described below will provide you with your main source of links to reference materials on the Web. In addition, many public and academic libraries, like the Internet Public Library, have put together lists of links to Web sites, categorized by subject. The difficulty is finding Web sites that contain the same kind of substantive content you'd find in a library. See the section on Reference Sources on the Web for a list of some Web-based reference materials, but please read Information found—and not found—on the Web to understand why it's different from using the library.

Understand how search engines work

Search engines are software tools that allow a user to ask for a list of Web pages containing certain words or phrases from an automated search index. The automated search index is a database containing some or all of the words appearing on the Web pages that have been indexed. The search engines send out a software program known as a spider, crawler or robot. The spider follows hyperlinks from page to page around the Web, gathering and bringing information back to the search engine to be indexed.

Most search engines index all the text found on a Web page, except for words too common to index, such as "a, and, in, to, the" and so on. When a user submits a query, the search engine looks for Web pages containing the words, combinations, or phrases asked for by the user. Engines may be programmed to look for an exact match or a close match (for example, the plural of the word submitted by the user). They may rank the hits as to how close the match is to the words submitted by the user.
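The indexing-and-lookup process described above can be made concrete with a toy sketch. The pages, their text, and the stop-word list below are invented for illustration; real engines index billions of pages and use far more sophisticated matching and ranking:

```python
# Toy inverted index: the data structure a search engine builds from
# crawled pages. Pages and stop words here are invented examples.
STOP_WORDS = {"a", "and", "in", "to", "the"}

pages = {
    "page1.html": "a guide to research on the web",
    "page2.html": "the web and the internet",
}

index = {}
for url, text in pages.items():
    for word in text.split():
        if word not in STOP_WORDS:      # common words are not indexed
            index.setdefault(word, set()).add(url)

def search(query):
    """Return pages containing every non-stop-word in the query."""
    sets = [index.get(w, set()) for w in query.split() if w not in STOP_WORDS]
    return set.intersection(*sets) if sets else set()

print(search("web research"))   # only page1.html contains both words
```

A stop-word-only query like "the" matches nothing at all, which is exactly why engines skip such words: they appear on nearly every page and carry no discriminating power.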

One important thing to remember about search engines is this: once the engine and the spider have been programmed, the process is totally automated. No human being examines the information returned by the spider to see what subject it might be about or whether the words on the Web page adequately reflect the actual main point of the page.

Another important fact is that all the search engines are different. They each index differently and treat users' queries differently (how nice!). The burden is on the searcher to learn how to use the features of each search engine. See the links to Search Engines and to sources which have done Evaluations of the various features of Web directories and search engines.

See the Web and internet tutorials in the Links section for online articles about search engines.

Know the difference between a search engine and a directory

A search engine like Google or Hotbot lets you seek out specific words and phrases in Web pages. A directory is more like a subject index in the library—a human being has determined the main point of a Web page and has categorized it based on a classification scheme of topics and subtopics used by that directory. Some examples of directories are Yahoo! and the Internet Public Library. Many of the search engines have also developed browsable directories, and most of the directories also have a search engine, so the distinction between them is blurring.

See the links to Web directories and to sources which have done Evaluations of the various features of Web directories and search engines.

Consult the reference librarian for advice

Reference librarians can often be of great help in planning your internet research. Just as they know their library's collection, they probably have done a lot of research on the internet and know its resources pretty well. They're also skilled at constructing search terms and using search engines, and they're trained to teach others how to search.

Learn about search syntax and professional search techniques

To be successful at any kind of online searching, you need to know something about how computer searching works. At this time, much of the burden is on the user to intelligently construct a search strategy, taking into account the peculiarities of the particular database and search software. The section on Skills for online searching will help.

Learn some essential browser skills

Know how to use your browser for finding your way around, finding your way back to places you've been before and for "note-taking" as you gather information for your paper. A large part of effective research on the Web is figuring out how to stay on track and not waste time—the "browsing" and "surfing" metaphors are fine for leisure time spent on the Web, but not when you're under time pressure to finish your research paper. Lots of colleges have Netscape tutorials - see Web and internet tutorials for links which will supplement the information below.


Understand the construction of a URL.

Sometimes a hyperlink will take you to a deep URL such as (to use a hypothetical address) www.example.com/research/howto.html. You should know that the page "howto.html" is part of a site called "www.example.com." If this page turns out to be a "not found" error, or doesn't have a link to the site's home page, you can try typing "www.example.com/research/" or "www.example.com" in the location box to see if you can find a menu or table of contents. Sometimes a file has been moved or its name has changed, but the site itself still has content useful to you; this is a way to find out.

If there's a tilde (~) in the URL, you're probably looking at someone's personal page on a larger site. For example, a hypothetical "www.example.edu/~jjones/" refers to a page at www.example.edu where J. Jones has some server space in which to post Web pages.
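This fallback routine (drop the file name, then successive directories, until you reach the site's home page) is mechanical enough to sketch in a few lines of Python. The URL below is a hypothetical example, not a real address:

```python
from urllib.parse import urlsplit

def site_candidates(url):
    """Given a dead deep link, yield progressively shorter URLs to try:
    each parent directory first, then the site's home page."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # Drop the last path segment repeatedly until only "/" remains
    for i in range(len(segments) - 1, -1, -1):
        path = "/" + "/".join(segments[:i]) + ("/" if i else "")
        yield f"{parts.scheme}://{parts.netloc}{path}"

# Hypothetical dead link: try its parent directory, then the home page
for candidate in site_candidates("http://www.example.com/research/howto.html"):
    print(candidate)
```

Each candidate is worth trying in the browser's location box in order; the first one that returns a real page often has a menu or table of contents pointing to the moved file.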


Be sure you can use your browser's "Go" list, "History" list, "Back" button and "Location" box where the URL can be typed in. In Web research, you're constantly following links through to other pages then wanting to jump back a few steps to start off in a different direction. If you're using a computer at home rather than sharing one at school, check the settings in your "Cache" or "History list" to see how long the places you've visited will be retained in history. This will determine how long the links will show as having been visited before (i.e., purple in Netscape, green on our site). Usually, you want to set this period of time to cover the full time frame of your research project so you'll be able to tell which Web sites you've been to before.

Bookmarks or favorites

Before you start a research session, make a new folder in your bookmarks or favorites area and set that folder as the one to receive new bookmark additions. You might name it with the current date, so you later can identify in which research session the bookmarks were made. Remember you can make a bookmark for a page you haven't yet visited by holding the mouse over the link and getting the popup menu (by either pressing the mouse button or right clicking, depending on what flavor computer you have) to "Add bookmark" or "Add to favorites." Before you sign off your research session, go back and weed out any bookmarks which turned out to be uninteresting so you don't have a bunch of irrelevant material to deal with later. Later you can move these bookmarks around into different folders as you organize information for writing your paper—find out how to do that in your browser.

Printing from the browser

Sometimes you'll want to print information from a Web site. The main thing to remember is to make sure the Page Setup is set to print out the page title, URL, and the date. You'll be unable to use the material if you can't remember later where it came from.

"Saving as" a file

Know how to temporarily save the contents of a Web page as a file on your hard drive or a floppy disk and later open it in your browser by using the "file open" feature. You can save the page you're currently viewing or one which is hyperlinked from that page, from the "File" menu or the popup menu accessed by the mouse held over the hyperlink.

Copying and pasting to a word processor

You can take quotes from Web pages by opening up a word processing document and keeping it open while you use your browser. When you find text you want to save, drag the mouse over it and "copy" it, then open up your word processing document and "paste" it. Be sure to also copy and paste the URL and page title, and to record the date, so you know where the information came from.
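The habit described above (never save a quote without its source) can even be automated. The sketch below is one way to do it in Python; the file name, entry format, page title, and URL are illustrative choices, not a citation standard:

```python
from datetime import date

def save_quote(notes_path, quote, url, title):
    """Append a quote to a notes file along with the page title,
    URL, and the date it was captured."""
    with open(notes_path, "a", encoding="utf-8") as f:
        f.write(f'"{quote}"\n')
        f.write(f"Source: {title} <{url}>, accessed {date.today().isoformat()}\n\n")

# Hypothetical page title and URL, for illustration only
save_quote("research-notes.txt",
           "Too much information is almost worse than too little.",
           "http://www.example.com/research/howto.html",
           "Effective Internet Research")
```

Recording the access date at capture time matters because web pages change or disappear; the date is part of what citation formats for electronic sources ask for.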

Be prepared to cite your Web references

Find out what form of bibliographic references your instructor requires. Both the MLA and APA bibliographic formats have developed rules for citing sources on CD-ROM and the Internet. Instructions for Citing Electronic Sources are linked from the Internet Public Library.


Published in Research Methods

The following are a few of the techniques and tools I use to make my Google searching more effective or more productive.

Synonym Searching

Google has a limit of 10 words per search [since expanded to 32], which can make it difficult to include all the possible variations on a word. For example, a search for reports on childhood obesity should probably also include the words child, children, kid, kids, youth and family as well as childhood, and the words obese, overweight and fat as well as obesity. Oops! That adds up to 11 possible search terms, and doesn't give you any leeway to include filetype: limitations or other words to narrow the search down to reports. One way to circumvent this limitation is to try Google's synonym search. Add a tilde (~) at the beginning of the words child and obese (~child ~obese), and Google retrieves web sites that use any of those synonyms.

Note that this tool works best for common words, and some of the synonyms may be broader than you wish. I needed to search for web sites of elementary school bands, music departments and choirs. I tried a search for ~music, but saw that I was also getting web sites with the words rock, MP3, radio, audio, song, sound, and records -- not really what I had in mind.

Google Personalized

Personalized Google is still in beta, but it's an interesting tool. Once you go to the Google Labs page and select Personalized, you will be sent to a new search page that includes a link to [Create Profile]. You can specify the type of searching you typically do, ranging from biotech and pharmaceuticals to dentistry to classical music. Click [Save Preferences], then type your search terms in the Google Personalized search box.

At the search results screen, you will now see something new -- a slider bar that lets you specify how much you want the search results sorted by those interests you specified. The default is minimal personalization; move the slider bar toward maximum, and you will see the search results change on the fly, as Google re-ranks the results based on your personal interests.

Keep in mind that this personalization is only available through the Personalized Google page. If you go to the main Google search page, the personalization option is not available.

Google Shortcuts

As with other search engines, Google has some built-in "answer" features that can sometimes come in handy.

If you type the word "define:" and a word (define:card for example), instead of the usual search results, you will get definitions of that word from a wide range of glossaries, dictionaries and lexicons.

Type a US company's name or stock symbol in the search box, and the first item in the search results page will be a link to current stock quotes for that company, provided by Yahoo Finance.

Type a US area code in the search box, and the first search result will link to a map showing the general coverage area of that area code. I find this particularly useful now that there are over 200 area codes.

Google's help pages include a full list of these shortcuts.

Specialized Searches

In addition to the well-known Google search tabs for searching the web, news and images, there are several specialized search tools for commonly searched subjects, including UncleSam, for searching federal government information; University Search, for searching within the sites of major colleges and universities; and even Google Microsoft, for searching Microsoft-related sites.


Published in Search Engine

Several weeks ago, a legal advocacy group issued a press release describing the organization's efforts on behalf of teenage girls who had been abused in a detention center. It referred readers to a redacted document on its Web site for more information.

As the mother of a teenage girl, I was intrigued. I opened the redacted PDF document and examined its security. When I found I was able to discover the names of the girls, I informed the group, which quickly corrected the flawed document.

But what if my motives were not that of a curious and outraged parent?

Stories about improperly redacted documents appear frequently in the news and legal literature. Often, those who discover the redacted information expose it. But the motives of researchers run the gamut from mild curiosity to winning at all costs. Thus, while exposure might not be desirable, use of the information without the creator's knowledge or consent could be worse.

As was the case in this example, such findings often involve serendipity. But luck isn't always a factor. Strategy plays a major role in certain types of research; for instance, competitive intelligence. It behooves companies to learn about these techniques in order to protect their confidential information.

Private - Keep Out!

When researchers want to know something about a company, one of the first places they check is its Web site. They read what the company wants them to know. Then, if they want to dig deeper while continuing to use the company itself as a source, they check two things: the Internet Archive and the Web site's robots exclusion file (robots.txt).

The former archives previous versions of the site. As I relate in an earlier article, these sometimes shed light on information the company might not want to reveal.

Because of improved security at Web sites, robots exclusion files generally are not as helpful as they used to be. But researchers still check them, and so should you.

The files contain commands that instruct search engines about areas of the site they should not index. Any legitimate search engine will obey these commands.

To work correctly, the file must appear in the root directory of the Web site and bear the filename robots.txt. Therefore, to find it, you enter the site's domain followed by the filename, for example: www.example.com/robots.txt

They are easy to read. The one on The Virtual Chase looks, in part, like this:

user-agent: *
disallow: /_private/
disallow: /cas/
disallow: /cir/
disallow: /data/

The user-agent is the targeted crawler (search engine). The asterisk is a wildcard. Each character string following the disallow command is a subdirectory. Consequently, this abbreviated set of commands tells all search engines not to crawl the subdirectories labeled _private, cas, cir, and data. A researcher, of course, will choose whether or not to attempt entry.

It's like placing a Keep Out sign on a door. If the door isn't locked, someone may walk through it.
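Robots exclusion rules like the ones above can also be checked programmatically. A minimal sketch in Python, using the standard library's urllib.robotparser and the sample rules quoted above (the page URLs are hypothetical):

```python
from urllib import robotparser

# Parse the sample robots.txt rules shown above and test whether a
# given crawler may fetch a path. In practice you would fetch the
# file from the site's root, e.g. with RobotFileParser.set_url().
rules = """\
User-agent: *
Disallow: /_private/
Disallow: /cas/
Disallow: /cir/
Disallow: /data/
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# A disallowed subdirectory vs. an ordinary page:
print(rp.can_fetch("*", "http://www.example.com/cas/report.html"))  # False
print(rp.can_fetch("*", "http://www.example.com/about.html"))       # True
```

Note that can_fetch only reports what the site *asks* crawlers to do; as the analogy above says, nothing in the protocol locks the door.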

Careless Clues

As I explain in the above-referenced article on the Internet Archive, a prospective client approached a group of my firm's lawyers about launching a new business in an industry with an unsavory reputation. One of the conditions for considering representation was that the woman not have prior dealings in the industry. She claimed she did not.

Research at the client's business Web site in the Internet Archive, however, uncovered circumstantial evidence of several connections. Through telephone research and public records, we were able to verify that not only was she working in the industry, she was the subject of a then-active federal criminal investigation.

Clues about information you would rather researchers not discover often come from the company itself. In a recent and widely publicized example, Google inadvertently released information about its finances and future product plans in a PowerPoint presentation.

Searching for Microsoft Office files is, in fact, an expert research strategy, because the metadata often reveals information the producer did not intend to share. You may tack on a qualifier or use a search engine's advanced search page to limit results to specific file types, such as Word documents (doc), PowerPoint presentations (ppt) or Excel spreadsheets (xls).

At Google, the qualifier is filetype: whereas at Yahoo it is originurlextension:. Enter the file extension immediately after the colon (no spaces). Check each search engine's help documentation for the appropriate qualifier, or consult a Web site, such as Search Engine Showdown, which tracks and informs about such commands.
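As a sketch, the qualifier syntax described above can be assembled mechanically for each engine. The company name and query terms here are placeholders, not part of the original examples:

```python
# Map each engine to its file-type qualifier, as described above.
QUALIFIERS = {
    "google": "filetype:",
    "yahoo": "originurlextension:",
}

def build_query(terms: str, extension: str, engine: str = "google") -> str:
    """Return a search string limited to one file extension.

    The extension follows the qualifier immediately, with no space.
    """
    return f"{terms} {QUALIFIERS[engine]}{extension}"

print(build_query('"Example Corp" confidential', "ppt"))
# "Example Corp" confidential filetype:ppt
print(build_query('"Example Corp" confidential', "xls", engine="yahoo"))
# "Example Corp" confidential originurlextension:xls
```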

Searching certain phrases sometimes produces intriguing results. Try the phrases below individually to discover the potential for this technique when coupled with a company, organization or agency name:

"not for public dissemination"

"not for public release"

"official use only" (variations include FOUO and U//FOUO)

"company confidential"

"internal use only"

You might find additional ideas for searching dirty in the Google Hacking Database.

Copyright Underground

Book search engines, such as Amazon's Search-Inside-This-Book, Google Book Search and the Text Archive at the Internet Archive, are becoming increasingly valuable in research. If you uncover even a snippet of relevant information online, it may save you valuable research time offline.

One of my recent success stories involves finding an entire chapter on the target company in a book published just a few months prior to the research. Of course, I was unable to read the book online. I had to purchase it. But the tools helped me find what I might have missed without them.

However, this is not the underground to which I refer. By using these tools, you are not skirting the process for rewarding those who wrote and published the book.

The underground, while eminently real, is not so much a place as it is a mindset - one that sets information free. The result is a mixed bag of commercial products, including books, music, digital artwork, movies and software, that have been copied or reverse engineered.

Try the search strategy below. Replace the phrase, Harry Potter, with the keywords of your choice:

"index of" "last modified size description" "parent directory" "harry potter"

The portion of the search statement preceding "harry potter" comprises a strategy for finding vulnerable Web sites or servers. In a nutshell, it commands the search engine to return matches to directory structures instead of single Web pages. If a Web site is properly secured, the search engine will be unable to provide this information.

To some extent, you can monitor the availability of files that comprise unauthorized copies of products by setting up what Tara Calishain calls information traps. Tara's excellent book on Information Trapping provides many examples of ways to monitor new information.

One possibility is to use the above search strategy for best-selling or popular products, and then set up a Google Alert for new matches to each query.

While you should monitor hits at other search engines besides Google, doing so requires more work. First, test and perfect the query so that you are retrieving useful results. Set the search engine preferences to retrieve 100 items per page. Then copy the URL when the search results display. Paste it into a page monitor, such as Website-Watcher or TrackEngine. The tracking software or Web service will monitor changes in the first 100 search results. You may opt to have it send the changes to you by e-mail.
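The change-detection step above can be sketched in a few lines, assuming you already have a way to fetch the search-results page: keep a fingerprint of the last-seen results and compare on each check. Dedicated tools such as Website-Watcher or TrackEngine do this (and much more) for you:

```python
import hashlib

# A minimal sketch of the monitoring idea described above: hash the
# last-seen search-results page, then report when the page changes.

def digest(page_text: str) -> str:
    """Fingerprint a page so changes can be detected cheaply."""
    return hashlib.sha256(page_text.encode("utf-8")).hexdigest()

def has_changed(previous_digest: str, page_text: str) -> bool:
    """True if the page no longer matches the stored fingerprint."""
    return digest(page_text) != previous_digest

# The result strings here stand in for fetched search-results pages.
old = digest("result 1\nresult 2")
print(has_changed(old, "result 1\nresult 2"))  # False: nothing new
print(has_changed(old, "result 1\nresult 3"))  # True: results changed
```

In a real trap you would fetch the saved results URL on a schedule, compare against the stored digest, and e-mail yourself when it changes, which is essentially what the page monitors named above automate.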

Companies and other organizations that want to protect proprietary or confidential information should conduct this type of research with regularity. You can expedite some of the search process with information traps. But considering the stakes, regular thorough searching is a worthwhile investment.


Published in Research Methods


World's leading professional association of Internet Research Specialists - We deliver Knowledge, Education, Training, and Certification in the field of Professional Online Research. The AOFIRS is considered a major contributor in improving Web Search Skills and recognizes Online Research work as a full-time occupation for those that use the Internet as their primary source of information.
