The Swiftype Blog / Category: Site Search

ObjectIdColumns: Transparently Store MongoDB BSON IDs in an RDBMS

Here at Swiftype, we use both MongoDB and MySQL to store some of our core metadata — not search indexes themselves, but users, accounts, search engines, and so on. As we’ve migrated data from MongoDB to MySQL, we’ve found ourselves needing to store the primary keys of MongoDB documents in MySQL.

While it’s possible to use more-or-less arbitrary data in MongoDB as your _id, most of the time you will simply use MongoDB’s built-in ObjectId type. This is a data type similar in concept to a UUID: it can be generated on any machine at any time, and the chance that it will be globally unique is still extremely high. Some relational databases offer native support for UUIDs; we thought, why shouldn’t we teach Rails how to get as close to that ideal as possible with ObjectIds, too?

The result has been our objectid_columns RubyGem, which we are proud to release as open source under the MIT license. Using ObjectIdColumns, you can store MongoDB ObjectId values as a CHAR(24) or VARCHAR(24) (which stores the hexadecimal representation of the ObjectId in your database), or as a BINARY(12), which stores an efficient-as-possible binary representation of the ObjectId value in your database.

No matter how you choose to store this data, it’s automatically exposed from your ActiveRecord models as an instance of the bson gem’s BSON::ObjectId class, or the moped gem’s Moped::BSON::ObjectId class. (ObjectIdColumns is compatible with both equally; the two are extremely similar.)

my_model = MyModel.find(...)
my_model.my_oid # => BSON::ObjectId('52eab2cf78161f1314000001')

You can assign values as an instance of either of these classes, or as a String representation of an ObjectId — in either hex or pure-binary forms — and it will automatically translate for you:

my_model.my_oid = BSON::ObjectId.new # OK
my_model.my_oid = "52eab32878161f1314000002" # OK
my_model.my_oid = "R\xEA\xB2\xCFx\x16\x1F\x13\x14\x00\x00\x01" # OK
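To make the two string forms concrete, here is a minimal sketch of how the 24-character hexadecimal form and the 12-byte binary form map onto each other. It uses only Ruby’s built-in pack and unpack1; the helper names hex_to_binary and binary_to_hex are hypothetical and are not part of objectid_columns, which may well do this differently internally.

```ruby
# Hypothetical helpers (not part of objectid_columns) showing how the
# hexadecimal and binary forms of an ObjectId relate to each other.

HEX_OBJECT_ID = /\A\h{24}\z/ # exactly 24 hexadecimal characters

# Convert a 24-character hex string into its 12-byte binary form.
def hex_to_binary(hex)
  raise ArgumentError, "expected 24 hex characters" unless HEX_OBJECT_ID.match?(hex)
  [hex].pack("H*")
end

# Convert a 12-byte binary string back into 24 lowercase hex characters.
def binary_to_hex(binary)
  raise ArgumentError, "expected 12 bytes" unless binary.bytesize == 12
  binary.unpack1("H*")
end

binary = hex_to_binary("52eab2cf78161f1314000001")
puts binary.bytesize       # 12
puts binary_to_hex(binary) # 52eab2cf78161f1314000001
```

Each byte is written as two hex digits, which is why the hex form needs 24 characters where the binary form needs only 12 bytes.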

ObjectIdColumns even transparently supports queries; the following will all “just work”:

MyModel.where(:my_oid => BSON::ObjectId('52eab2cf78161f1314000001'))
MyModel.where(:my_oid => '52eab2cf78161f1314000001')
MyModel.where(:my_oid => "R\xEA\xB2\xCFx\x16\x1F\x13\x14\x00\x00\x01")

Enjoy! Head on over to the objectid_columns GitHub page for more details, or just drop gem 'objectid_columns' in your Gemfile and go for it!

If you enjoyed the tips in this tutorial, make sure to bookmark our blog and subscribe for more announcements like our new Swiftype Ruby Gem.

Our Cloud Stack at Swiftype

Swiftype site search was featured as LeanStack’s service of the week. As part of that I wrote a guest blog post about how Swiftype uses cloud services to run our business.

“Implementing a better product with less hassle is really only half the advantage of using a service like ours. The other half — which doesn’t seem to get as much marketing play — is that by leveraging the product of a company dedicated to a single, specific technology, you realize the gains of having a full-time team of domain experts dedicated to improving your search feature, without assuming any of the cost. At Swiftype we spend all of our time thinking about, developing, and iterating on search, and every time we ship an improvement, all of our customers reap the benefits instantly. Our experience has shown that at most companies it can be a full-time job just maintaining an internal search system, much less improving it over time. When search isn’t a core competency of your company, we believe you’re better off letting us take care of the details. And of course the same philosophy applies to our company as well, which is why we leverage so many existing cloud-based services in our daily operations. Anywhere that we can save time and resources using a product that another company focuses their full effort on delivering is a win for us, because it allows us to spend our resources on what we do best — building great search software.”

Read the post to learn more about our cloud stack and the services we use.

If you liked this post, please remember to bookmark our blog and subscribe to our newsletter. We’ll be posting announcements and more from the Swiftype team, as well as our friends and partners who power their search with Swiftype, such as Laughing Squid.

How We Use Swiftype to Understand our Customers

Paul Graham’s advice to entrepreneurs is simple: “Make something people want.” Making is the easy part; figuring out what people want is much harder. In the startup world, there are several interesting techniques for figuring out what people want: customer development, user surveys, crowdsourced idea generation, and so on. However, my recent favorite is Swiftype’s weekly analytics email. Let me explain.

Quickly See What People Are Searching For

The following screenshot is from Swiftype’s sample report:

Top searches by number of queries

The first section of the email lets you see at a glance what your users are searching for. We use Swiftype to power our documentation search, so our search terms tell us what our users most need help with. The top search for us right now is “email.” This makes sense because our users typically want to know how to set up email. The top few keywords gave us a good sense of what our users are looking for right after signing up and have helped us shape our product tour.

Figure out What New Stuff to Build

The second section of the email is more interesting. You can see which searches returned no results at all:

Top searches with No Results

In our case, the missing searches could mean one of two things:

* A feature that we have but that is missing documentation.
* A feature that we don’t have.

For us it’s mostly the latter. For example the top result for us in this category is “reports”, since we don’t have reporting yet (our early adopters did not care for it but we are working on it now). Using this feature we also realized that people are looking for integrations like Pivotal, JIRA etc. Based on this, we decided to work on a hosted app platform that we will be rolling out in a few weeks.
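The two report sections described above boil down to counting queries, once over all searches and once over only the searches that returned nothing. The following is a rough sketch of that idea, not Swiftype’s implementation: the SearchEvent struct, the method names, and the sample data are all invented for illustration.

```ruby
# Hypothetical search log entry: the query text and how many results came back.
SearchEvent = Struct.new(:query, :result_count)

# Return the `limit` most frequent queries as [query, count] pairs.
# Queries are trimmed and lowercased so "Email" and "email" count together;
# ties are broken alphabetically.
def top_queries(events, limit: 5)
  events
    .map { |event| event.query.strip.downcase }
    .tally
    .sort_by { |query, count| [-count, query] }
    .first(limit)
end

# Same ranking, restricted to searches that returned no results.
def top_queries_without_results(events, limit: 5)
  top_queries(events.select { |event| event.result_count.zero? }, limit: limit)
end

# Invented sample data.
events = [
  SearchEvent.new("email", 12),
  SearchEvent.new("Email", 12),
  SearchEvent.new("reports", 0),
  SearchEvent.new("reports", 0),
  SearchEvent.new("jira", 0),
  SearchEvent.new("billing", 3),
]

p top_queries(events, limit: 2)       # [["email", 2], ["reports", 2]]
p top_queries_without_results(events) # [["reports", 2], ["jira", 1]]
```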

Either way, we learn exactly where we need to improve. It could be improvements to an existing feature (adding documentation, improving the UX) or ideas for new features. Used with other techniques like user interviews and analytics, Swiftype has really helped us improve our app. In the future, we plan on using Swiftype to power our app directory search so we can discover ideas for new apps. The same technique can be applied to your marketing site as well.

Sitemap.xml Support for Swiftype

At Swiftype we’re always working on new ways to improve the quality of the crawl of your website, and today we’re announcing Swiftype crawler support for the Sitemap.xml protocol.

The Sitemap.xml protocol is a well-documented and widely implemented standard for specifying exactly which URLs you would like web crawlers to index on your website. If your website supplies a sitemap.xml file, our crawler will follow your specifications as it builds a search index for your website.

If you aren’t familiar with Sitemap.xml files, we’ll take you through a quick tutorial here, and there is additional information in our documentation section as well as the official protocol page.

To get started, create a simple sitemap.xml file. An example sitemap.xml that specifies 3 URLs might look as follows:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.yourdomain.com/</loc>
  </url>
  <url>
    <loc>http://www.yourdomain.com/faq/</loc>
  </url>
  <url>
    <loc>http://www.yourdomain.com/about/</loc>
  </url>
</urlset>
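If your site already has a list of URLs, you may prefer to generate this file rather than write it by hand. Here is a minimal sketch in Ruby that produces the same structure as the example above; build_sitemap and xml_escape are hypothetical helper names, and the URLs are placeholders.

```ruby
# Characters that must be escaped inside XML text, per the Sitemap protocol.
XML_ESCAPES = {
  "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;", "'" => "&apos;"
}.freeze

# Escape a string so it is safe to place inside a <loc> element.
def xml_escape(text)
  text.gsub(/[&<>"']/, XML_ESCAPES)
end

# Build a sitemap.xml document containing one <url> entry per URL.
def build_sitemap(urls)
  entries = urls.map do |url|
    "  <url>\n    <loc>#{xml_escape(url)}</loc>\n  </url>\n"
  end.join

  %(<?xml version="1.0" encoding="UTF-8"?>\n) +
    %(<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n) +
    entries +
    "</urlset>\n"
end

puts build_sitemap([
  "http://www.yourdomain.com/",
  "http://www.yourdomain.com/faq/",
  "http://www.yourdomain.com/about/",
])
```

Escaping matters mostly for URLs with query strings, where a bare & would make the file invalid XML.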

Next, you’ll put the sitemap.xml file on your web server at a location that is accessible by our crawler. Many sites place the sitemap at the root of the domain (i.e. http://www.yourdomain.com/sitemap.xml), but any location is fine. Whatever location you choose, you should specify it in your robots.txt file as follows:

User-agent: *
Sitemap: http://www.yourdomain.com/sitemap.xml

If you’re unfamiliar with the robots.txt file, you can find more information at the official Web Robots page.

Once your robots.txt file is updated and your sitemap.xml file has been uploaded, you’re finished. The next time the Swiftype crawler visits your website, we’ll recognize your sitemap.xml file and follow the links you specify.
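To check your own setup, it can help to see roughly what a crawler does with these two files. The sketch below is illustrative only and is not Swiftype’s crawler: it finds Sitemap lines in a robots.txt body, then pulls the loc entries out of a sitemap document. The method names are hypothetical, and the regex-based extraction is a simplification; a real crawler would fetch the files over HTTP and use a proper XML parser.

```ruby
# Return every URL named by a "Sitemap:" line in a robots.txt body.
# The directive name is matched case-insensitively.
def sitemap_urls(robots_txt)
  robots_txt.each_line.map { |line| line[/\A\s*sitemap:\s*(\S+)/i, 1] }.compact
end

# Return the text of every <loc> element in a sitemap document.
# Simplification: XML entities such as &amp; are not unescaped here.
def sitemap_locations(sitemap_xml)
  sitemap_xml.scan(%r{<loc>\s*(.*?)\s*</loc>}m).flatten
end

robots_txt = <<~ROBOTS
  User-agent: *
  Sitemap: http://www.yourdomain.com/sitemap.xml
ROBOTS

sitemap_xml = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
      <loc>http://www.yourdomain.com/</loc>
    </url>
    <url>
      <loc>http://www.yourdomain.com/faq/</loc>
    </url>
  </urlset>
XML

p sitemap_urls(robots_txt)       # ["http://www.yourdomain.com/sitemap.xml"]
p sitemap_locations(sitemap_xml) # ["http://www.yourdomain.com/", "http://www.yourdomain.com/faq/"]
```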

As always, if you’re having trouble or want more information, feel free to get in touch. Also, don’t forget to follow the blog so you don’t miss out on great content from our friends like Bob Hiler from Mixergy.

Exclude Unwanted Content with Swiftype

Are there parts of your site you don’t want indexed? We’ve got you covered.

To exclude parts of your website by path, you can use Path Exclusions. You can exclude pages starting with, containing, or ending with the text you specify. For advanced users, we also support regular expression matches.
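To see how the four kinds of match differ, here is a small sketch of the matching logic in Ruby. It is an assumption about how such rules could behave, not Swiftype’s implementation: ExclusionRule, its rule types, and excluded? are hypothetical names, and the sample paths are made up.

```ruby
# Hypothetical representation of a path exclusion rule.
# `type` is one of :starts_with, :contains, :ends_with, or :regex.
ExclusionRule = Struct.new(:type, :pattern) do
  # True when this rule applies to the given URL path.
  def matches?(path)
    case type
    when :starts_with then path.start_with?(pattern)
    when :contains    then path.include?(pattern)
    when :ends_with   then path.end_with?(pattern)
    when :regex       then Regexp.new(pattern).match?(path)
    else raise ArgumentError, "unknown rule type: #{type.inspect}"
    end
  end
end

# A path is excluded as soon as any single rule matches it.
def excluded?(path, rules)
  rules.any? { |rule| rule.matches?(path) }
end

rules = [
  ExclusionRule.new(:starts_with, "/admin"),
  ExclusionRule.new(:contains, "/drafts/"),
  ExclusionRule.new(:ends_with, ".pdf"),
  ExclusionRule.new(:regex, '\A/blog/\d{4}/\z'),
]

puts excluded?("/admin/users", rules)      # true
puts excluded?("/files/report.pdf", rules) # true
puts excluded?("/blog/2013/", rules)       # true
puts excluded?("/blog/hello-world", rules) # false
```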

To add a path exclusion, click on a crawler-based engine, then select the Domains tab, then the domain to which you want to add path exclusions.

As you type your exclusion, we’ll show you a sample of the pages that will be removed from the index.

Once you’re happy with the exclusions, hit the Recrawl button to put them into effect.

On an individual page, you can exclude content (for example, your header or footer) by adding the data-swiftype-index attribute set to false.

Here’s an example:

An example page with content exclusion

This is your page content, which will be indexed by the Swiftype crawler.

This content will be indexed, since it isn’t surrounded by an excluded tag.

By combining Path Exclusions and Content Exclusion, you can precisely control how your website is indexed by Swiftype.

As always, if you have trouble, please reach out.

Announcing Swiftype: Modern Search for Sites and Apps

Swiftype founders Matt Riley and Quin Hoxie

Today we are announcing Swiftype, the best way to add search to your site or app. Quin and I have long been frustrated by how hard it is to add good search to web sites and apps, so we decided to do something about it.

Swiftype has an API you can use to index arbitrary content, but we also can crawl your site so you can get started in minutes. We’ll create a search engine literally while you watch. You can install it on your site using a simple JavaScript embed and your users will enjoy great, fast search results and autocompletion. In addition, Swiftype lets you customize search results with drag-and-drop and gives you detailed analytics about the queries your users are making.

Swiftype is already powering search for customers like Twilio, TwitchTV, Listia, and Fastly.

If you’d like to hear more or discuss adding Swiftype search to your site or app, please reach out. Also, be sure to keep an eye on the Swiftype blog – we’ll be releasing frequent updates.

You can read more about our launch on TechCrunch, or give us a try.
