The Swiftype Blog / Category: Developer

Salesforce Selects Swiftype as a Launch Partner for Federated Search

Swiftype has been chosen by Salesforce as a launch partner for its Summer 2017 Federated Search release. Swiftype’s proven Enterprise Search platform provides Salesforce users the ability to find and access content stored outside of Salesforce from within Salesforce Classic, Salesforce Console, or Lightning Experience. Regardless of where content lives, users can search for documents in Dropbox and Google Drive, content living in Confluence spaces, Jira tickets, Slack and Gmail conversations, and many other files critical to their work, without ever leaving Salesforce.

Swiftype’s extensive connector framework includes Box, SharePoint, Slack, Google G Suite, Jira, Confluence, Evernote, Dropbox, and more. For a full list, view here.

Companies and their employees benefit greatly from the extension of Swiftype’s best-in-class artificial intelligence (AI), natural language processing (NLP) and machine learning (ML) technologies into Salesforce Federated Search. These technologies help identify a user’s true intent for every search, and apply refinements to filter content from external sources, for a precise and desired result output.  

With Swiftype’s search engine powering Salesforce Federated Search, employees can search once and see results across all their external data sources in the same Salesforce results page. Swiftype’s backend technology better understands queries and prioritizes content sources based on the search queries themselves, historical search behaviors, and even the individual’s role and most-used sources. For example, support reps work on cases and often reference articles or documentation stored in apps like Confluence or SharePoint, so, as they search, the system learns to prioritize results from those applications. Likewise, Swiftype’s algorithms learn that sales reps work most in accounts and opportunities and often need to locate contracts or collateral, so they surface data stored in Dropbox, Google G Suite, or Box. Allowing search to crawl outside data sources and populate results within a Salesforce user’s existing workflow is paramount to maximizing employee productivity and efficiency.

GETTING STARTED

Salesforce users can get started with Salesforce Federated Search by contacting Swiftype here or by signing up for a free 14-day trial.  More details on implementation can be found on the Salesforce blog here.

HOW IT WORKS

Users can launch Swiftype Enterprise Search several different ways from within Salesforce:

1) Directly from the search bar (see Figure 1)
2) From an embedded Swiftype Search tab (see Figure 2)
3) By using an installed Swiftype browser extension (see Figure 3)

Figure 1 (search bar queries will bring you to a results page like this)

 

Figure 2 (search within a designated Swiftype tab in your Salesforce instance)

 

Figure 3 (Swiftype’s browser extension populates results without disrupting your workflow)

 

The Heat is On for Software Developers: Enterprise Search Can Help

Most of us can relate to something coder Sean Hickey posted to his Medium account a few years back. Here’s the gist of it: In Year One, coders stick to a very tight and succinct coding style. As the years progress, most take things to higher and higher levels, writing longer and more detailed strings of code. Eventually, they become so experienced and efficient that by Year Ten, they’re back to square one, writing short, sweet programs that do the work of more complex ones.

But one of his commenters made me laugh out loud with this response: “The Twentieth Year: ‘Hey John, can you write me that Hello World program? I need it by tomorrow.’”

Yup, we’ve all been there. The pace of everything in business has increased over the past few decades, and software developers and engineers are not immune to its effects. How often have you heard, “I need it yesterday!”

This Ain’t Your Grandad’s IT Department: Supply and Demand Rules

Higher consumer demand has led to an evolution in all aspects of business. For the modern software developer this means shorter project cycles, improved software tools, a higher focus on team collaboration, and the more prolific use of open source software. And while software development tools have improved, there are now also a lot more of them, and each one does something different.

In fact, more apps, and more types of apps, are being built today than were built in the last 40 years. Today, there’s an open source library for pretty much everything, and if you can’t find what you need, you create it.

Collaboration Is A Double-Edged Sword: Increased Efficiency AND Increased Data

Project timelines have sped up and collaboration has increased. Whereas most coding projects were solo endeavors 20-25 years ago, today most enterprises function in a highly collaborative manner. Projects filter through many departments and cycles are measured in minutes and hours rather than days and weeks.

Creating “one size fits all” applications from scratch is no longer optimal because they can be complex, drawn out, poorly designed, and take years to complete. Instead, developers now look to open-source libraries to create applications that can easily integrate with other solutions, including third-party SaaS services. And while APIs make it simpler to complete these integrations, that also means the number of APIs to keep track of and systems to monitor has grown exponentially.

The best part of this “ease of integration” is that it has opened up new sharing capacity: Whereas systems at one time were focused on a centralized database (such as with desktop software), today aggregating to the cloud is the norm, and software is being designed to be easily shared and widely distributed, mostly due to increased demand from consumers and mobile employees.

Needless to say, it can be difficult to keep track of all the content created on a daily basis, as well as manage all the different duties of a software engineer all at once. And that’s where internal search helps increase productivity.

How Cloud-Based, Internal Search Helps Keep Software Developers On Track

As you might imagine, our developers use enterprise search tools frequently. Using our own Enterprise Search solution internally even allows our developers to address customer service issues in record time, closing some tickets in seconds by quickly cross-referencing with other clients’ information requests via search. Surely that’s one for the record books! With tickets in Jira, solutions in GitHub, and the conversation about it all in Slack, giving developers one tool to find everything can save them a ton of time.

Enterprise search lets developers quickly inspect why certain changes are being made, tracks important and potentially disparate data, and provides the context necessary to rapidly understand a new project. No more wasting time on email streams or knocking on office doors for explanations.

Working with design teams (and their MANY changes) is easier with enterprise search as well. A frequent complaint we hear from developers is that they often don’t know why they are building something or what the end goal is. With all the information about a project easily searchable—from the first creative brief to the final code—no one is left in the dark. Using simple but powerful search queries means hours aren’t wasted wading through Dropbox or Slack. Instead, when a request from marketing comes in, developers can source all the digital assets spread over whichever apps are in use on that project.

The heat is definitely on for software developers, and it’s only getting hotter. If you’d like to explore how Swiftype Enterprise Search could help take the pressure off your team, don’t hesitate to reach out.

Great Developers Ship, They Don’t Configure Search

We’re always excited when Swiftype customers give unprompted kudos to our solutions. Of course, we also work with a lot of our customers to showcase how they use Swiftype.

But when someone who’s not a customer writes about how great our products are after a trial, it makes us a bit more proud to be doing what we’re doing. If our products provide such a great experience that someone needs to tell the world, well that just makes us smile!

One recent example is from David Walsh (@davidwalshblog), a Senior Software Engineer and evangelist for Mozilla, who also runs the wildly popular David Walsh Blog. On his blog, which uses WordPress, David defaulted (as many do) to the out-of-the-box search functionality. And, just like many of you, he found it “underwhelming,” so decided to look for a replacement.

You can read his full post here, which goes into much detail on Swiftype’s features, explains how he set up both Swiftype Site Search and Enterprise Search, and offers his overall impressions. (SPOILER ALERT: He loved both of our solutions!)

Reading David’s post inspired me to write this post because it made me realize how web developers struggle to balance the demands of marketers (like me) against the reality of managing a website. Add to that their desire to work smarter and to work on projects they are passionate about, and it’s easy to see how they can become frustrated with things like lackluster search solutions.

As my team and I spend more and more time speaking with developers specifically about search, we’re seeing clear yet unique needs for both public-facing site search and internal enterprise search.

Site Search for Your Public Audience

Engineers and developers want to spend more time developing products and websites, not configuring search. It’s pretty obvious, and understandable. Developing allows them to be creative, solve problems, and build new things. Search, albeit a critical feature for site visitors, is part of a site’s foundation. It should already be there. And it should work, and work well.

Developers are often asked by marketing or others to tweak search results, which should be easy. If you’re not a developer, you probably assume it’s a simple fix. But that’s not the case with most solutions. “Google Webmaster Tools doesn’t allow me to modify result order so I’m somewhat helpless in correcting the issue on Google, but Swiftype allows me to correct the issue for my own site search,” wrote David in his blog post.

Search is not something that can be created or optimized in a few minutes, especially if your search was custom developed. It’s not much better if your search was created by your blog or site platform, or even if it was created by Google. It’s also why few people build their own (read here why building search is so difficult) and why most people default to WordPress’ canned search or Google Site Search.

With Swiftype, however, tweaking search results is easy. We’ve built our solution with developers in mind, and to make their jobs easier. As David puts it, “All I need to do is drag and drop the result and Swiftype remembers the preferred result order.”

Ultimately, what’s most important is the experience you provide to your site’s visitors. Do you want search to be a frustrating part of that experience or a differentiator? Considering that one-third to one-half of site visitors use search, you’re probably going to want to make it great!

Enterprise Search for Your Internal Customers

Internally, developers have more to consider, since search is on the hook for helping every employee work smarter and faster. David goes into great detail in his post, and he points the finger at the proliferation of specialized web services for making enterprise search such a bear. You might be using HipChat or Slack, plus Dropbox and Box, and GitHub and Jira, plus Salesforce and Zendesk. Again, in David’s words, “We have so many focused services now, however, that we run into a frequent problem:  where the hell do we find anything?”

Working smarter means removing the burden of foundational tasks, like API configurations, from the developer’s workload. Swiftype lets you choose from dozens of prebuilt connectors to speed and simplify a holistic enterprise search. If a connector isn’t available, our APIs enable you to create a secure and unique endpoint in just a few clicks. It’s that easy. Developers can even add intranets and cloud-based repositories to their search results pool by using Swiftype’s web crawler feature. It’s all designed to make search easier for developers so they can quickly get back to developing.

Swiftype's Connector Framework

What’s important here are two things. First, you’re elevating the experience and productivity of your internal customers by helping them quickly find what they need. Second, you’re giving developers more control and more productivity for themselves by making search easier to configure while providing better results.

As David Says, “Give Swiftype a Shot”

I thought about writing a typical marketing “we’re great” conclusion here, but then realized David did a fantastic job of summarizing it on his post:

“Both of Swiftype’s awesome offerings, Site Search and Enterprise Search, are really impressive.  Instead of rolling out your own search or using a lacking free alternative, give Swiftype a shot.”

I couldn’t have said it better myself…so I didn’t!

Swiftype Enterprise Search Expands Connector Platform to Include Atlassian’s Jira and Confluence

With emphasis on worker efficiency at an all-time high, more businesses are turning to Atlassian, a leading provider of software development and collaboration tools. Two popular Atlassian products, Jira and Confluence, help teams work together, build software, and better serve customers. With the continuous expansion of Swiftype’s Connector Framework, we are thrilled to announce our new native connectors to Jira and Confluence.

Expansion is the Key to Success for Enterprise Search
Jira and Confluence are products that are widely used and trusted by millions, so, given the incredible amount of content that is created and stored in these applications, they were obvious choices. The formal and supported connections between Jira, Confluence and the Enterprise Search Platform will make it all the more seamless to find content across multiple applications at once. By bringing Atlassian-supported work into the Swiftype platform, users can quickly discover helpful content to build into projects, tasks, documents and more. Swiftype’s commitment to meeting people where they work continues with these additions, allowing users to search across more data sources without having to leave the application they’re already working in.

Swiftype for JIRA Screenshot

What You Can Expect from Swiftype for Jira and Confluence:

  • Streamlined project management. Most teams use Jira for project management and Confluence for documentation, but they also use a plethora of complementary apps to get their work done, like GitHub for code collaboration and management, Dropbox to access UI files from design teams, and Help Scout or another customer support management system, to name a few. Swiftype’s integrated Enterprise Search solution helps teams stay agile by enabling effortless incorporation of design thinking, agile development, and release management into their process.
  • Instant, relevant content for all your projects. Imagine you are assigned a pull request in GitHub, but you don’t have much context for why those changes need to be made. Instead of having to hunt around for similar pull requests in GitHub, related tasks in Jira, or more relevant information in Confluence, you can use the Enterprise Search Chrome extension to immediately see related Jira tickets, documentation in Confluence, sprint planning documents in Google Drive, account records of impacted customers in Salesforce, and any other related content from your different sources.
  • Global collaboration. Atlassian takes into account the global nature of project development and encourages flexible cross-organization planning. Collaboration can take place across time zones, but also across tools, like Slack. Swiftype Enterprise Search also offers a federated integration with Slack, which allows users to easily pull up any file from any connected content repository, complete with smart filters and AI-based natural language processing, and share it directly with channels or individuals.

Get Started!
We’re excited to welcome Atlassian tools ‘to the family’ of our Enterprise Search connectors. It’s simple to set up. With just a few clicks, your entire library of cloud content is accessible right alongside your Jira and Confluence workflows. Visit us in the Atlassian Marketplace to learn more and sign up for a free trial.

How Swiftype Uses Swiftype:
Part 1 – Developers

I’m Brian, a Software Engineer at Swiftype. I’ve been working a lot on Swiftype Enterprise Search, and I use it every day.

I had our rotating “Support Wizard” hat this week, which means I’m responsible for addressing customer inquiries and cases for the week. Enterprise Search helped me close a customer case in 15 seconds. The customer needed to whitelist our crawler’s IP addresses so we could crawl their site. I went to search.swiftype.com in my browser and searched for “crawler ip ranges.” I clicked the first result from Help Scout and it took me to a recent ticket requesting the same information but from a different customer. Bam! That’s exactly what I was looking for! Case closed.

Brian Stevenson, Engineering Wizard

 

When dealing with code, I use Enterprise Search for a number of different things. The browser extension is super handy when reviewing Pull Requests (PRs) in GitHub. For example, I was looking at a PR that was pulling in a newer version of nokogiri, but it didn’t have a lot of context. All it had was the version bump, the new version of the gem, and a small commit message. I opened the Enterprise Search Chrome extension and I was immediately presented with other PRs and Jira tickets related to the same body of work. I was able to click through to those results to get a much better idea of where and why those changes were taking place. At that point, I had much more context and was able to effectively review the changes in front of me. The browser extension is perfect for that: I can open it up on a pull request on GitHub and see a plethora of additional, relevant PRs and Jira tickets for that area of code.

Using the browser extension with Jira is also super helpful. If I’m looking at a ticket in Jira, it shows me all open pull requests and any other related Jira tickets that may not have been linked. Furthermore, it shows me all of our sprint planning docs in Google Drive and Dropbox, due to our full text extraction capabilities and fine-tuned search algorithms.

One of my favorite things to use Enterprise Search for is when I’m working with our Design team. They create a lot of visual content, like mockups and templates, but where that content is stored in Dropbox isn’t exactly self-evident. So when I’m working on a project that requires implementing their designs, rather than trying to wade through the ocean of digital assets in Dropbox, or bug them to send me an exported version of the new design, I just search for the content in the Enterprise Search app.  I use really simple, but extremely powerful queries like “new dashboard design in dropbox” or “sidebar icons in dropbox.” The search results all have image previews of the visual content they’ve been designing, so I can quickly scan them to find exactly what I’m looking for in an instant.

Enterprise Design Results

I also use Enterprise Search to show me all of the open pull requests assigned to me, across all of our repositories. It’s extremely useful because I don’t have to go to each repository individually to check for those PRs I need to take action on. I also sometimes use it to see PRs assigned to other people, in case they’re out sick, for example.

Speaking of people, the “Person View” is pretty awesome. One of my developers just went on vacation and I needed to see what he was working on so I could get the work done before the end of the sprint. I just searched for “Chris,” and because he was automatically created as a person in our organization (just by signing up for an account), I was able to see all of his recent changes across all our repositories in GitHub and other sources. I was able to jump on the highest priority task he was working on and finish it off. Success! I was also able to get more context on the other issues he was working on because I found some conversations he had with other engineers in Slack, and comments he made on tickets in Help Scout.

We also just hired a new engineer (who is coincidentally also named Brian)! I was helping him get up to speed and needed to find this mythical “onboarding” document. I did a quick search for “welcome guide”, and sure enough, the document showed up as the first result. And with a few more quick searches, I was able to find all the other onboarding documents that were scattered around our various cloud services. It’s so handy, and easy, to be able to search and find documents like this. It saves me so much time!

Last but not least, I use the mobile app to receive notifications for upcoming meetings. We have a sprint planning meeting every two weeks, so I get a notification on my phone that says hey, there’s this sprint planning meeting coming up, do you want to review these documents first? And I’m like yeah! I do want to review those docs so I can remember what we’re talking about at sprint planning! Thanks, Swiftype!

Swiftype proposed, and I said yes! A True Love Story in the Making.

I joined Swiftype shortly after graduating from Georgia Tech with a B.S. in Computer Engineering last May. I got to spend four years in Atlanta, which provided me an amazing startup ecosystem that let me invest a significant amount of time while still being a college student. I was able to work on a great team at Springbot and build out and launch an MVP at Stackfolio. I got to venture out a bit and intern at MongoDB last summer as well.

The startups I got to work with varied quite a bit in size, and I decided I wanted to join a startup with a small, somewhat established development team. I think this sweet spot is the best type of environment for growing as a software developer. Swiftype definitely fit those criteria, and much more.

Reasons I ended up signing my life to Swiftype:

  • I can see value in the product.
    • I always default to using the “site:www.website.com” syntax on google instead of using a website’s dedicated search tool. I think it’s silly that the majority of websites get beaten by a generic web crawler at finding their own content.
  • Swiftype gave me the time of day.
    • Quin [the CTO of Swiftype] personally reached out to me the day before my interview with a phone call and a follow-up email to make sure I was doing fine and made my way to San Francisco without issue. He also was very active throughout my entire interview process to make sure everything went smoothly. I got the feeling that he actually (even if just a little bit) cared about me.
    • I got the opportunity to ask in-depth questions about the company and its technology, which caused my interview to run way longer than scheduled. Swiftype was one of the few companies that was happy to take the time to give me in-depth answers.
    • Initial contact to offer was less than three weeks. (Not the quickest of all time, but considering I was on the other side of the world or on a plane for 9 days of that time, I’d say it’s pretty good.)
  • I got a clear idea of what I would be doing.
    • More often than not, I think new software developers go into jobs pretty blind on what they’re actually going to do. I learned this the hard way through my first internship! It’s perfectly understandable given many circumstances, and perfectly reasonable for people to put themselves into that situation, but it still makes me very uncomfortable.
  • I knew who I’d be working with.
    • I got to interview with the entire engineering team. I left with the feeling that if I could be where they are when I get to their age, I’d be pretty happy with my career. We’ll see how that turns out.
  • Swiftype aligned with my interests.
    • The vast majority of my abandoned personal projects revolved around scraping data and doing something with it. I only found a select few startups whose business revolved around this concept and actually did meaningful things with it.
  • Super soft hoodies that actually look normal.
    • At least at the time, this was a priority. Unfortunately, not many people or companies actually took me seriously, which is understandable. Regardless, this is my public request for the long awaited Swiftype hoodie V2.
  • Positive Culture inclinations.
    • It’s tough to evaluate culture through interviews that span a short amount of time. But I got the same baseline vibes from the Swiftype engineering team as from the friendliest, most heartwarming development team I interviewed with in Tennessee. This absolutely wasn’t a priority while I was in the job search, but looking back on it, this definitely helped me make a quick decision to say yes to Swiftype.

*****

Note from the Swiftype Team:
Looking for a new opportunity? Jonesing to work with a talented, up-and-coming software development team? Really into soft hoodies and free lunch? You might be a great fit for the Swiftype team! We’re not on Tinder, but you can check out our careers page for our current openings and apply. 

To Crawl or Not to Crawl: How to Index Data for Site Search


If you’re considering a new site search solution like Swiftype, you’re probably already aware of the benefits of upgrading your website’s search experience: things like greater control over search results, a better user experience, and the ability to gather analytics from user searches. You also know that taking your site search to the next level will increase conversions and positively impact your company’s bottom line.

But before you can start enjoying the benefits of enhanced site search, there’s one important decision to make: how to index the content on your site. Indexing lays the foundation for your search engine by taking inventory of all your site data, then organizing it in a structured format that makes it easy for the search algorithm to find what it needs later on. Essentially, if your website is a stack of thousands of papers, the search index is the mother of all filing cabinets.
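
To make the filing-cabinet picture a bit more concrete, here is a minimal sketch of an inverted index in Python. It only illustrates the general idea of organizing content so it can be found quickly later; it is not how Swiftype builds its index, and the `build_index` and `search` names and the sample pages are made up for the example.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def build_index(pages: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each lowercase word to the set of page URLs that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index: Dict[str, Set[str]], query: str) -> List[str]:
    """Return the URLs that contain every word in the query, sorted by URL."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


# Hypothetical site content, keyed by URL path.
pages = {
    "/pricing": "Plans and pricing for site search",
    "/docs/crawler": "How the crawler indexes your site",
    "/docs/api": "Index documents with the search API",
}
index = build_index(pages)
print(search(index, "site search"))
```

The expensive work (reading every page) happens once, up front; each later search is just a few dictionary lookups and a set intersection.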

There are a few different ways to go about indexing site content, but the two main options are using a web crawler or a search API. Both choices have pros and cons, so it’s helpful to understand which one is the best fit for your situation. Here’s the lowdown on each.

Web Crawler

You may be familiar with Google’s web crawler, Googlebot, which perpetually “crawls” the internet, visiting each available web page and indexing content for potential Google searches. Swiftype’s crawler, Swiftbot, does the same thing for individual websites like yours.

Using a web crawler to index site data has a couple of key advantages. For one thing, it’s extremely plug-and-play. Rather than pay a team of developers to build the index, simply select the crawler option and let it do its thing, with no coding required.

A crawler also allows you to get your new site search up and running very quickly. For example, Swiftbot creates a search index within minutes by simply crawling your website URL or sitemap. And it stays on top of changes to your site, immediately indexing any new information so that search results always reflect the latest and greatest your business has to offer.
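
As a rough illustration of the sitemap route, the sketch below pulls the page URLs out of a standard sitemap document using Python's standard library. It is a generic example of reading the sitemaps.org format, not Swiftbot's actual code; the `urls_from_sitemap` name and the sample URLs are assumptions for the example.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text: str) -> List[str]:
    """Return every <loc> URL listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    locs = root.findall("sm:url/sm:loc", SITEMAP_NS)
    return [loc.text.strip() for loc in locs if loc.text]


# Hypothetical sitemap content; a real crawler would fetch this over HTTP.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/blog/</loc></url>
</urlset>"""

for url in urls_from_sitemap(sample):
    print(url)
```

A crawler would then visit each of these URLs, extract the page content, and add it to the index.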

In our experience, the web crawler option works best for the vast majority of our customers. It’s fast and easy to use, yet also creates a powerful, comprehensive search experience that’s a huge improvement over a fragile plugin or other antiquated site search solution. However, there are some situations where the customer needs a greater amount of customization, and in those cases, an API integration might be the way to go.

Developer API

The main advantage of using an API for search indexing is that it gives you full programmatic control over the content in your search engine. There are infinite ways to build a search experience, and an API (like the Swiftype Search API) lets you choose your own adventure and make changes as often as you like.

For example, if you want to index sensitive data that cannot be exposed on your website, such as product margins or page views for a particular article, you may want a more custom indexing setup than the one that comes with the web crawler. The developer API gives you granular, real-time control over every aspect of your search engine.
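
Here is a minimal sketch of what preparing such a document for an indexing API might look like. The payload shape (an `external_id` plus a list of typed fields), the `build_document` helper, and the type mapping are assumptions for illustration only; check the API reference for the exact schema and endpoint before relying on any of it.

```python
import json
from typing import Any, Dict


def field_type(value: Any) -> str:
    """Pick a field type for a value (assumed mapping for this sketch)."""
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    return "string"


def build_document(external_id: str, fields: Dict[str, Any]) -> Dict[str, Any]:
    """Turn a plain dict into a document payload with typed fields."""
    return {
        "external_id": external_id,
        "fields": [
            {"name": name, "value": value, "type": field_type(value)}
            for name, value in fields.items()
        ],
    }


# Data that never appears on the public page can still be indexed.
document = build_document(
    "article-42",
    {"title": "Indexing 101", "page_views": 1280, "margin": 0.37},
)
print(json.dumps(document, indent=2))
```

Because the payload is built in your own code, fields like `page_views` or `margin` can influence search without ever being rendered on the site.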

Unlike the web crawler option, using an API usually requires a fair amount of coding, so we typically see this option used by large businesses with bigger budgets and/or a developer team on staff. Also, since an API integration is custom, the initial indexing process can take time to set up, so it’s less attractive to customers who are anxious to get started.

Which one is best?

The choice between the web crawler and the developer API will come down to your specific situation. Most Swiftype customers are extremely happy with the crawler, but some do require the flexibility and control inherent in the API. We offer both options so that you can choose the best one for your site and business.

No matter which option you choose for indexing data, the ultimate outcome will be an enhanced site search experience that’s more relevant—and more profitable—than your current solution.

How to Index Thumbnails for Crawler Based Engines

As you’re getting started with Swiftype, you may be wondering how to index thumbnails from your website and serve them to users in your search results. The answer to this question lies in using Swiftype’s custom <meta> tags, which allow site owners to pass detailed web page information directly to Swiftbot, our web crawler, as it moves across your site. As Swiftbot encounters these custom Swiftype <meta> tags, it indexes their content and incorporates that information in your search engine index schema.

To index thumbnails from your website, all you need to do is add a Swiftype image <meta> tag to the <head> section of your website template that indicates where images are located on your various page types. For illustration purposes, the Swiftype image <meta> tag is formatted like this:

<meta class="swiftype" name="image" data-type="enum" content="http://fullurl.com/example.jpg" />

Swiftype recommends placing these <meta> tags at the template level of your website to ensure that image files are dynamically populated within the tags, rather than being added manually for every page on site.

NOTE: the value of the “content” attribute must be HTML encoded. For more information see this guide.
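
If your templates are rendered from code, the encoding step can be handled for you. The sketch below uses Python's standard `html.escape` to build the tag; the `swiftype_image_meta` helper is a made-up name, and the `class="swiftype"` and `name="image"` attributes are assumed from the tag format described in this post.

```python
from html import escape


def swiftype_image_meta(image_url: str) -> str:
    """Build an image <meta> tag with an HTML-encoded content attribute."""
    encoded = escape(image_url, quote=True)
    return (
        '<meta class="swiftype" name="image" data-type="enum" '
        f'content="{encoded}" />'
    )


# The "&" in the query string must become "&amp;" inside the attribute.
print(swiftype_image_meta("http://fullurl.com/thumb.jpg?w=200&h=200"))
```

Most template engines escape attribute values automatically, so a helper like this is mainly useful when you assemble the markup by hand.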

Alternatively, you can wrap images with a body-embedded Swiftype image <meta> tag to avoid changing your website <head>. For example, Swiftbot will index example.jpg into the image field from the HTML below:

<body>
  <h1>Hello world</h1>
  <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed ut risus sed ante dignissim pharetra aliquet a orci. Maecenas varius.</p>
  <p>In in augue molestie, bibendum velit vel, luctus erat. Curabitur cursus, tellus at feugiat lacinia, tellus est suscipit lectus, non commodo diam elit sit amet justo.</p>
  <div data-swiftype-name="image" data-swiftype-type="enum">http://fullurl.com/example.jpg</div>
</body>

It is important to note that in both the <head> and <body> embedded <meta> tags, you need to specify the data-type attribute as enum. For images, this will always be the case. For any other custom meta tags you choose to define, each attribute must be a valid, Swiftype-supported field type, which you may read about here.
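
To picture what a crawler does when it meets these tags, here is a rough sketch that collects <head>-style meta tags from a page with Python's built-in HTML parser. This is not Swiftbot's implementation; the `SwiftypeMetaCollector` class is hypothetical, and it assumes the tags carry `class="swiftype"` along with `name`, `data-type`, and `content` attributes.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional, Tuple


class SwiftypeMetaCollector(HTMLParser):
    """Collect name, type, and content from <meta class="swiftype"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.fields: List[Dict[str, Optional[str]]] = []

    def handle_starttag(self, tag: str, attrs: List[Tuple[str, Optional[str]]]) -> None:
        attr = dict(attrs)
        if tag == "meta" and attr.get("class") == "swiftype":
            self.fields.append(
                {
                    "name": attr.get("name"),
                    "type": attr.get("data-type"),
                    # The parser decodes HTML entities in attribute values.
                    "content": attr.get("content"),
                }
            )


page = (
    "<html><head>"
    '<meta class="swiftype" name="image" data-type="enum" '
    'content="http://fullurl.com/example.jpg?w=1&amp;h=2" />'
    '<meta name="description" content="ignored by this collector" />'
    "</head><body></body></html>"
)
collector = SwiftypeMetaCollector()
collector.feed(page)
print(collector.fields)
```

Note that the HTML-encoded `&amp;` in the page comes back out as a plain `&`, which is why the encoding requirement above does not corrupt the stored URL.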

Once you index thumbnails from your website, you can easily customize your search results and autocomplete to feature thumbnails in a range of shapes and sizes with the Swiftype Result Designer.

To learn more about using custom Swiftype <meta> tags to refine your search engine index, check out our tutorial. As always, if you need help or have any questions, feel free to reach out to us at [email protected].

Building an Asynchronous API to Improve Performance

One of the challenges we’ve had to deal with at Swiftype is that we have had customers pushing a lot of search and indexing traffic from very early on. When a customer is pushing hundreds of index updates per second, it’s important to respond quickly so we don’t start dropping requests.

In order to do that, we’ve built bulk create/update API endpoints to reduce the number of HTTP requests required to index a batch of documents and moved most processing out of the web request. We’ve also invested in front-end routing technology to limit the impact customers have on each other.

However, we were not satisfied. Sometimes when a large customer was indexing a huge number of documents, our front-end queues would still back up. In the pursuit of even better response times for our customers, we’ve built an asynchronous indexing API. Our goals in creating the new API were high throughput, supporting bulk requests for all interactions, and excellent developer ergonomics. We wanted an API that was fast and easy to use.

Here’s how it works.

[Diagram: flow of the asynchronous bulk indexing API]

First, customers submit a batch of documents to create or update. The request for this looks just like our pre-existing bulk create or update API, but goes to a new endpoint.

When our server receives the request, it performs a quick sanity check on the input, without hitting the database. If all the input parameters are present and validly formatted, we create two records in our database for each document that was submitted: a document creation journal, and a document receipt.
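To make the "sanity check without hitting the database" idea concrete, here is a minimal sketch of a purely in-memory validation pass. The document shape (an external_id plus a list of fields with name, value, and type) and the error messages are assumptions for illustration, not the actual Swiftype request format.

```ruby
# Keys every field entry is assumed to carry in this sketch.
REQUIRED_FIELD_KEYS = %w[name value type].freeze

# Returns a list of error strings; an empty list means the document passed.
# Nothing here touches a database: it only inspects the parsed JSON.
def validation_errors(document)
  return ["document must be a Hash"] unless document.is_a?(Hash)

  errors = []
  errors << "external_id is required" if document["external_id"].to_s.empty?

  fields = document["fields"]
  if !fields.is_a?(Array) || fields.empty?
    errors << "fields must be a non-empty array"
  else
    fields.each_with_index do |field, index|
      missing = field.is_a?(Hash) ? REQUIRED_FIELD_KEYS - field.keys : REQUIRED_FIELD_KEYS
      errors << "fields[#{index}] is missing #{missing.join(', ')}" unless missing.empty?
    end
  end
  errors
end

batch = [
  { "external_id" => "doc-1",
    "fields" => [{ "name" => "title", "value" => "Hello", "type" => "string" }] },
  { "fields" => [{ "name" => "title" }] }
]
batch.each { |document| p validation_errors(document) }
```

Because the check only looks at the parsed payload, a malformed batch can be rejected before any rows are written.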

For performance, we insert these rows using activerecord-import. This is a great library that uses a single INSERT statement with multiple rows. This results in a massive speed improvement compared to standard ActiveRecord when saving a large number of records. We also generate the IDs ahead of time using BSON. By generating the IDs ahead of time, we don’t need to get them from the database after inserting, and using BSON lets us encode a timestamp in the ID at the cost of a larger ID column.
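The two ideas in this paragraph, generating IDs up front and sending many rows in one INSERT, can be sketched in plain Ruby without any gems. This is not how activerecord-import or the bson gem are implemented: SketchObjectId is a hypothetical stand-in that only mirrors the general ObjectId layout (a timestamp prefix followed by random and counter bytes), and the table and column names are made up for the example.

```ruby
require "securerandom"

# Hypothetical stand-in for a BSON ObjectId: 4-byte big-endian timestamp,
# 5 random bytes, 3-byte counter, rendered as 24 hex characters.
module SketchObjectId
  COUNTER_LIMIT = 1 << 24
  @counter = SecureRandom.random_number(COUNTER_LIMIT)
  @lock = Mutex.new

  def self.generate(time = Time.now)
    count = @lock.synchronize { @counter = (@counter + 1) % COUNTER_LIMIT }
    bytes = [time.to_i].pack("N") +
            SecureRandom.random_bytes(5) +
            [count].pack("N")[1, 3] # keep the low 3 bytes of the counter
    bytes.unpack1("H*")
  end
end

# Builds one parameterized INSERT covering every row, plus the flat bind list.
def multi_row_insert(table, columns, rows)
  row_sql = "(" + (["?"] * columns.size).join(", ") + ")"
  sql = "INSERT INTO #{table} (#{columns.join(', ')}) " \
        "VALUES #{([row_sql] * rows.size).join(', ')}"
  [sql, rows.flatten(1)]
end

# IDs are known before the insert, so no round trip is needed to read them back.
rows = %w[doc-1 doc-2].map { |external_id| [SketchObjectId.generate, external_id, "pending"] }
sql, binds = multi_row_insert("document_receipts", %w[id external_id status], rows)
puts sql
puts binds.inspect
```

Since the first four bytes are the creation time, sorting by ID roughly sorts by age, which is the benefit traded against the larger ID column.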

Once created, we enqueue a message for each document creation journal onto a queue that is read by a pool of loops workers. Loops is a dead-simple background processing library written by our Technical Operations Lead, Oleksiy Kovyrin. It makes it easy to write code that does one thing forever, in this case, reading messages off the queue and creating the associated document in the database.

The response to the API request includes a way to check the status of all the document receipts. To make the API easy to use, we’re including URLs to the created resources. Though we’re not following all its precepts, this approach is inspired by the idea of the hypermedia API. These URLs make it easy for both humans and computers to find the resource.

Since the API is asynchronous, users must poll the document receipts API to check for the status of the document creation. We’ve built an abstraction in our Ruby client library that allows developers to simulate a synchronous request, although we recommend that only for development.
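A client-side polling loop along these lines might look like the following sketch. It is not the code in our Ruby client library: fetch_receipts is a hypothetical callable standing in for the HTTP call to the document receipts API, and the "pending" status name is an assumption (only "complete" and "failed" are described in this post).

```ruby
FINISHED_STATUSES = %w[complete failed].freeze

# Polls until every receipt is finished or the timeout elapses.
# fetch_receipts: callable taking receipt IDs and returning an array of hashes.
def wait_for_receipts(receipt_ids, fetch_receipts, timeout: 10.0, interval: 0.5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    receipts = fetch_receipts.call(receipt_ids)
    return receipts if receipts.all? { |receipt| FINISHED_STATUSES.include?(receipt["status"]) }

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "timed out waiting for document receipts"
    end
    sleep interval
  end
end

# Simulated API: reports "pending" twice, then "complete".
polls = 0
simulated = lambda do |ids|
  polls += 1
  ids.map { |id| { "id" => id, "status" => polls < 3 ? "pending" : "complete" } }
end

p wait_for_receipts(%w[r1 r2], simulated, timeout: 2.0, interval: 0.01)
```

Blocking on a loop like this is convenient in development, but in production it usually makes more sense to check receipts later in a background job.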

By pushing all work except for JSON parsing and the most minimal input validation to the backend, we’re able to respond to these API requests very quickly. On the backend, the loops workers read messages off the queue and create documents. When a loops worker attempts to create a document, it updates the document receipt (either with the status of “complete” and a link to the created/updated document, or with the status “failed” and a list of error messages) and deletes the document creation journal.
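The worker's job can be sketched with in-memory stand-ins. This does not use the loops library: the hashes below play the role of the journal and receipt tables, create_document is a hypothetical callable, and a nil message is used to stop the loop so the example terminates (a real worker would run forever).

```ruby
# Processes one journal: create the document, record the outcome on the
# receipt, and always delete the journal afterwards.
def process_journal(journal, receipts, journals, create_document)
  receipt = receipts.fetch(journal[:receipt_id])
  begin
    document_url = create_document.call(journal[:payload])
    receipt.update(status: "complete", document_url: document_url)
  rescue StandardError => e
    receipt.update(status: "failed", errors: [e.message])
  ensure
    journals.delete(journal[:id])
  end
end

# Reads journal IDs off the queue until it sees nil.
def run_worker(queue, receipts, journals, create_document)
  while (journal_id = queue.pop)
    journal = journals[journal_id]
    process_journal(journal, receipts, journals, create_document) if journal
  end
end

journals = {
  "j1" => { id: "j1", receipt_id: "r1", payload: { "external_id" => "doc-1" } },
  "j2" => { id: "j2", receipt_id: "r2", payload: {} }
}
receipts = { "r1" => { status: "pending" }, "r2" => { status: "pending" } }
create_document = lambda do |payload|
  raise ArgumentError, "external_id is required" unless payload["external_id"]
  "/documents/#{payload['external_id']}"
end

queue = Queue.new
journals.each_key { |id| queue << id }
queue << nil # stop signal for this example only
run_worker(queue, receipts, journals, create_document)
p receipts
p journals
```

Deleting the journal in the ensure branch means a journal that is still present has not been attempted yet, whatever the outcome of earlier ones.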

This brings us to one final aspect of the asynchronous API: how we make sure it keeps working. If our loops workers started failing, the document creation journals would back up without being processed, and no documents would be created/updated. To guard against this, we have built a monitoring framework that alerts us when the oldest record in the document creation journal table is older than a set threshold.
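The check itself is simple, as the following sketch suggests. It is a standalone function rather than our monitoring framework, and the function name, message format, and five-minute threshold are all assumptions for illustration.

```ruby
# Returns an alert message when the oldest journal exceeds the threshold,
# or nil when the backlog is healthy (or empty).
def journal_backlog_alert(journal_created_ats, threshold_seconds:, now: Time.now)
  oldest = journal_created_ats.min
  return nil if oldest.nil?

  age = now - oldest # Float seconds
  return nil unless age > threshold_seconds

  "oldest document creation journal is #{age.round}s old (threshold #{threshold_seconds}s)"
end

now = Time.now
created_ats = [now - 900, now - 5] # one journal has been waiting 15 minutes
puts journal_backlog_alert(created_ats, threshold_seconds: 300, now: now)
```

Because processed journals are deleted, the age of the oldest remaining row is a direct measure of how far behind the workers are.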

This solution has been successful for us in beta tests with our largest API users, and we have now rolled it out to everyone.

We hope this helps you build out your next high-throughput API. If this is the kind of thing you’re interested in, we’re hiring engineers for our core product and infrastructure teams.

ObjectIdColumns: Transparently Store MongoDB BSON IDs in an RDBMS

Here at Swiftype, we use both MongoDB and MySQL to store some of our core metadata — not search indexes themselves, but users, accounts, search engines, and so on. As we’ve migrated data from MongoDB to MySQL, we’ve found ourselves needing to store the primary keys of MongoDB documents in MySQL.

While it’s possible to use more-or-less arbitrary data in MongoDB as your _id, very, very frequently you will simply use MongoDB’s built-in ObjectId type. This is a data type similar in concept to a UUID; it can be generated on any machine at any time, and the chance it will be globally-unique is still extremely high. Some relational databases offer native support for UUIDs; we thought, why shouldn’t we teach Rails how to get as close to that ideal as possible with ObjectIds, too?

The result has been our objectid_columns RubyGem, which we are proud to release as open source under the MIT license. Using ObjectIdColumns, you can store MongoDB ObjectId values as a CHAR(24) or VARCHAR(24) (which stores the hexadecimal representation of the ObjectId in your database), or as a BINARY(12), which stores an efficient-as-possible binary representation of the ObjectId value in your database.

No matter how you choose to store this data, it’s automatically exposed from your ActiveRecord models as an instance of the bson gem’s BSON::ObjectId class, or the moped gem’s Moped::BSON::ObjectId class. (ObjectIdColumns is compatible with both equally; the two are extremely similar.)

my_model = MyModel.find(...)
my_model.my_oid # => BSON::ObjectId('52eab2cf78161f1314000001')

You can assign values as an instance of either of these classes, or as a String representation of an ObjectId — in either hex or pure-binary forms — and it will automatically translate for you:

my_model.my_oid = BSON::ObjectId.new # OK
my_model.my_oid = "52eab32878161f1314000002" # OK
my_model.my_oid = "R\xEA\xB2\xCFx\x16\x1F\x13\x14\x00\x00\x01" # OK
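To show how the 24-character hex form and the 12-byte binary form relate, here is a small standalone sketch using only core Ruby. It is not the gem's implementation; the function names are hypothetical, and the normalizer relies on to_s returning the hex form, as it does for a String and for a BSON::ObjectId.

```ruby
HEX_OBJECT_ID = /\A\h{24}\z/

# True for a 24-character hex string. The string is viewed as raw bytes
# first so arbitrary binary input can be tested safely.
def hex_object_id?(string)
  string.b.match?(HEX_OBJECT_ID)
end

# 24 hex characters -> 12 raw bytes (what a BINARY(12) column would hold).
def object_id_to_binary(hex)
  raise ArgumentError, "expected 24 hex characters" unless hex_object_id?(hex)
  [hex].pack("H*")
end

# 12 raw bytes -> 24 hex characters (what a CHAR(24) column would hold).
def object_id_to_hex(binary)
  raise ArgumentError, "expected 12 bytes" unless binary.bytesize == 12
  binary.unpack1("H*")
end

# Accepts either string form (or anything whose to_s is the hex form).
def normalize_object_id(value)
  string = value.to_s
  return string.downcase if hex_object_id?(string)
  return object_id_to_hex(string) if string.bytesize == 12
  raise ArgumentError, "not a recognizable ObjectId: #{string.inspect}"
end

hex = "52eab2cf78161f1314000001"
binary = object_id_to_binary(hex)
puts binary.bytesize
puts normalize_object_id(binary)
```

The two forms cannot be confused because their lengths differ: a valid hex string is always 24 bytes and a binary value is always 12.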

ObjectIdColumns even transparently supports queries; the following will all “just work”:

MyModel.where(:my_oid => BSON::ObjectId('52eab2cf78161f1314000001'))
MyModel.where(:my_oid => '52eab2cf78161f1314000001')
MyModel.where(:my_oid => "R\xEA\xB2\xCFx\x16\x1F\x13\x14\x00\x00\x01")

Enjoy! Head on over to the objectid_columns GitHub page for more details, or just drop gem 'objectid_columns' in your Gemfile and go for it!

If you enjoyed the tips in this tutorial, make sure to bookmark our blog and subscribe for more announcements like our new Swiftype Ruby Gem.
