Tuesday, August 25, 2015

My Summer Internship With Polyvore’s Engineering Team

This summer, four awesome interns joined Polyvore’s engineering team. Here, they share what they experienced and learned over the last three months.

Alexis Larry 
Alexis was an engineering intern on the Android team. She is studying computer science and statistics at Carnegie Mellon University. Fun fact: Alexis’ team went paintballing and she has the scars to prove it.

This summer I was thrown into Polyvore’s Android codebase and gained a deeper understanding of what it means to be a software engineer. I had never worked with Android before, but my team welcomed me with open arms and helped me realize my new passion for mobile development.

Throughout the summer, I learned the important lesson that it’s okay to make mistakes. One time, my code accidentally broke parts of our Android app, so I worked quickly with my team to fix it and learn from my mistakes. Working in Polyvore’s start-up environment empowered me to drive my own projects and ask challenging questions. One of my first projects revolved around UI. For my next project, I wanted to branch out, so I asked to do more in the data layer of the app. My team made that happen, and I pushed features live to users around the world.

Brian Wachowicz
Brian was an engineering intern on the Revenue & Data Mining team. He is studying computer science at Carnegie Mellon University. Fun fact: during his internship his team had an offsite at a puzzle room where they paid money to be locked in a room and required to solve a series of puzzles to find the escape key. They didn’t manage to escape in time, but did better than the average.

Working on Polyvore’s Data Mining team provided me a unique opportunity to impact Polyvore’s multiple data systems at a huge scale. Over the course of the summer, I worked with new data pipelines to build more personalized recommendations for Polyvore’s global community.

As an intern, I was surprised to work on important projects that were immediately integrated into our production codebase. Before joining, I assumed we would just be put to work testing existing code, but at Polyvore we got to see our own code make a huge impact on the way the site performed for millions of users. Polyvore’s development cycle is blazingly fast, and the sense of community is incredibly strong. Every evening, I looked forward to going to work the next day.

Jare Fagbemi
Jare is an engineering intern on the Web Shopper Delight team. He is studying computer science at Stanford University. Fun fact: The baked goods Jare's manager brings in at the start of each week are the highlight of his Monday mornings. 

Even though I was extremely excited to start interning this summer, I wasn’t sure what to expect on my first day on the job at Polyvore. I was hoping that I’d get the chance to dive deeper into JavaScript and front-end development than I had in my previous internships. I also wanted to get more exposure to best practices and to push more code with greater frequency.

It was even better than I had hoped. I’ve learned more at Polyvore about actually writing code than in the last two summers combined. Interning at Polyvore felt very much like being a full-time employee. By my fourth day on the job, I was pushing code to production. I was picking up the same tasks as my mentor and the other full-timers around me. The codebase I spent the most time on was a Node app hosted on a private GitHub repo. The absence of a learning curve meant not only that I could get to work as quickly as possible, but also that I could spend the summer learning far more about Node and Express, knowledge that I can take with me wherever I go.

Shreya Vemuri
Shreya was an engineering intern on the Web Shopper Delight team. She is studying computer science at Carnegie Mellon University. Fun fact: Shreya’s team named each of their sprints after a different Pokemon and were given a mini figure of that Pokemon for their desk.

From day one, I felt like a regular full-time engineer. Polyvore has a high-energy, fast-paced, user/product-focused environment that is also mixed with so much fun and transparency. Everyone was willing to answer all my questions and help me understand what I wanted to know related to engineering, product, metrics and experiment analysis. By day two, I was encouraged to pick up tasks that any full-time engineer would work on, and I shipped code within my first week on the job.

Looking back at the past three months, I am so happy I had the opportunity to learn and grow while making an impact for Polyvore’s global community. When I began my internship, I learned JavaScript and Node.js while working on front-end features. As I worked on this layer, I also became interested in API implementation and expressed my desire to work on that as well. My team and manager were super flexible, allowing me to do just that, and I was able to implement an API endpoint. There is a lot of trust and ownership embedded in Polyvore’s culture, which helps people grow and evolve.

This summer, I enhanced my technical knowledge, contributed to product features used by millions, and learned about the company culture at a start-up and the aspects that make a company click. These were invaluable opportunities for me to grow as both an engineer and a person.

Thank you Polyvore, my manager, mentor and coworkers, for an enriching summer!

Thursday, July 16, 2015

Polyvore’s Summer 2015 Hackathon: An Inside Look at Polyvore’s Hack Culture

By Dan Cox, Head of Engineering at Polyvore

Twice a year, we have week-long hackathons at Polyvore, with a few single-day, mini-hackathons thrown in. Why so many? We have a terrific team, and we want to give them the opportunity to direct the future of Polyvore.

Recently, we completed our summer hackathon, and it rocked! Coming out of that experience, I wanted to share a few tips on how to build an awesome hack culture.

Embracing and Celebrating Failures 
We have two simple hackathon rules: 1) develop something (somewhat) related to Polyvore, and 2) failure is totally okay as long as you learn something from that failure.

There is a difference between failing and trying. Trying means you might not push your effort to completion; you're not fully committed. Failure means you saw the end (and maybe it wasn't pretty), but that's alright because the journey is worthwhile.

Polyvore embraces a culture where we celebrate failures, and turn them into learnings we apply towards the future. A hackathon is a terrific way to help get engineers aligned with this idea because there is less immediate risk. Want to fully redesign our entire desktop? Terrific, hack away! The team understands that what you are delivering is an iteration on an idea, not the final product.

Empowering Our Eng Team
Software engineers tend to be creative and inquisitive people. Many engineers will tell you the reason they got into software had to do with “wanting to see how this works” or “automating away my work”. The types of problems we face as software engineers provide us with an engaging and challenging way to apply our creativity and inquisitiveness to real world problems needing creative solutions. For those of us lucky enough to be paid to write software, it is a match made in heaven.

Even with the opportunity to work on challenging puzzles and debugging issues, engineers need to express their creativity along paths that may or may not have a direct business impact, or may go against the architecture or technology choices of the company where they work. To encourage that kind of creative thinking, we give special awards to stand-out hacks, such as most immediately impactful, most creative, audience favorite and most needle-moving potential.

Everyone Hacks, Not Just Engineers
One of the unique aspects of our hack culture is that we encourage everyone across our three offices and dozens of teams to come together to work on an idea. In the week before the hackathon, we started a Slack channel, #hackathon, to get everyone chatting about potential ideas. Immediately team members from engineering, product, design, marketing, communications, sales and senior management (including our CEO and COO) started submitting ideas and collaborating.

A hackathon is a great opportunity for people that normally don’t work together to collaborate to create a great UX. This year Jess Lee, Polyvore CEO and co-founder, and I both worked on multiple projects. Although our day jobs don’t usually revolve around coding, it was great getting our hands dirty again and contributing code. I won the coveted “Keep Your Day Job” award, which meant “great idea, but horrible code”.

[Yue Wu, software engineer, Jess Lee, co-founder and CEO, Jianing Hu, co-founder] 

Hackathon Results: 
At our recent hackathon, we had 40 people (most working in teams of 2-4 people) contribute 32 ideas. Of those ideas, 5 were shipped within the first 2 weeks, and a total of 19 are currently being developed into user-facing features. 

It was an awesome week, and we’re excited to continue pushing hackathon ideas to production over the next few months. Thanks to everyone that submitted ideas! Can’t wait to see what our next hackathon has in store!

Monday, April 13, 2015

Meet the Engineers Behind Polyvore: Lisa Liang

Get to know the Polyvore engineering team! Here, Lisa talks about the values of our engineering team and how we’re using our data to understand personal style.

What do you do every day?
I work on the Consumer Services team, which builds scalable, flexible and efficient backend systems for our web, iOS, Android and client teams.

How does Polyvore’s scale affect your daily work?
We’re still a relatively small company, but we serve a ton of people, and ingest A LOT of data, so you really have the opportunity to work on a ton of different stuff and be exposed to parts of the business that you normally wouldn’t at a bigger company. We’re also transparent in why we make decisions, so it’s much easier to keep the rest of the company updated on what each team is doing.

Polyvore is known for caring about our community. How do you think those values spill into engineering?
We always, always listen to our users. One thing the community should know is that we monitor and use all your feedback, whether it’s left on the blog posts, tweeted at us, or submitted through the app. Your opinions matter and factor into what we decide to work on every day because without you, there would be no Polyvore. We’re so thankful that the community is so passionate about the product we wake up every day to work on.

How does your eng team decide what to work on?
My team exists to make our client teams’ jobs easier, so we start by looking at what the web, iOS and Android teams decide to work on for the year and figure out how we can best build what they need under the hood, so we can focus on giving our users the best experience.

What is a technical problem that Polyvore is in a unique position to solve?
Style. We have so much valuable data on what our users like, clip and create sets and collections about. That data reflects how current trends in the fashion world affect the average consumer as well as their individual style. We can analyze that data to create a distinct style profile for each of our users, one that gets smarter as you use Polyvore and makes better recommendations for sets, collections and things we think you’d like.

How would you describe the Polyvore engineering culture?
Our culture is super relaxed. Everyone leaves their ego at the door and there’s a great atmosphere of sharing knowledge and going out of the way to offer support so that we can all build the best product for our users. We deploy multiple times every day and work hard to be an agile team that iterates quickly while still weighing the pros and cons of everything we do.

Any fun facts we should know about you?
Lisa keeps the secrets of the world in her hair bun.

Friday, March 6, 2015

Cassandra Compaction and Tombstone Behavior: Leveled vs. SizeTiered Compaction

Compactions in Cassandra can be contentious because of their impact on I/O load and the extra free disk space they require. This post gives a primer on compaction, then discusses how Cassandra's data organization and tombstone handling differ between the Leveled and SizeTiered compaction strategies.

What is compaction?

Compaction is a maintenance process which re-organizes SSTables to optimize data structures on disk as well as reclaim unused space. It is helpful to understand how Cassandra handles commits to the datastore to understand why compaction is so important to Cassandra's performance and health.

When writing to Cassandra, the following steps take place:

  1. The commit is logged to disk in a commit log entry, and inserted into an in-memory table (the memtable)
  2. Once the memtable reaches its configured threshold, it is flushed to disk
  3. Entries from the memtable being flushed are written out as a new SSTable in the column family
  4. If compaction thresholds are reached, a compaction is run

The key takeaway is that writes only ever add new entries on disk. Since SSTable entries are immutable, a row in an SSTable cannot be changed once written, so an update to an existing row ends up as a separate entry in a later SSTable. For example, a simple schema for a column family might look like:

CREATE TABLE simple_cf (
 id int,
 text1 text,
 text2 text,
 PRIMARY KEY (id)
);

Some initial data is populated into the column family:

cqlsh:test> INSERT INTO simple_cf (id, text1, text2) VALUES (1, 'This is a test 1', NULL);
cqlsh:test> UPDATE simple_cf SET text2='This is a test 2' WHERE id=1;

The Cassandra server is flushed (nodetool flush). A (partial) update is performed after the flush:

cqlsh:test> UPDATE simple_cf SET text2='This is a test 3' WHERE id=1;
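
To make the write steps listed above more concrete, here is a toy Python model of the sequence: write to the memtable, flush, then update the same row after the flush. It is only a sketch, not how Cassandra is implemented. The `ToyColumnFamily` class, its logical clock, the second flush, and the choice to model each flush as producing its own immutable table are all assumptions made for illustration; the commit log and tombstones are left out entirely.

```python
class ToyColumnFamily:
    """Toy model of the write path: memtable -> immutable flushed tables."""

    def __init__(self):
        self.memtable = {}   # row key -> {column: (timestamp, value)}
        self.sstables = []   # flushed tables; never modified after a flush
        self._clock = 0      # logical timestamp (assumption for the sketch)

    def write(self, key, **columns):
        """Record column values in the memtable with a fresh timestamp."""
        self._clock += 1
        row = self.memtable.setdefault(key, {})
        for name, value in columns.items():
            row[name] = (self._clock, value)

    def flush(self):
        """Move the current memtable to 'disk' and start an empty one."""
        if self.memtable:
            self.sstables.append(self.memtable)
            self.memtable = {}

    def read(self, key):
        """Merge every table holding the row; the newest timestamp wins."""
        merged = {}
        for table in self.sstables + [self.memtable]:
            for name, (ts, value) in table.get(key, {}).items():
                if name not in merged or ts > merged[name][0]:
                    merged[name] = (ts, value)
        return {name: value for name, (ts, value) in merged.items()}

    def compact(self):
        """Rewrite all flushed tables as one, keeping the newest cells."""
        merged = {}
        for table in self.sstables:
            for key, row in table.items():
                target = merged.setdefault(key, {})
                for name, cell in row.items():
                    if name not in target or cell[0] > target[name][0]:
                        target[name] = cell
        self.sstables = [merged] if merged else []


cf = ToyColumnFamily()
cf.write(1, text1="This is a test 1")
cf.write(1, text2="This is a test 2")
cf.flush()                              # first table on "disk"
cf.write(1, text2="This is a test 3")   # partial update after the flush
cf.flush()                              # the update lands in a second table

tables_before = len(cf.sstables)
row = cf.read(1)
cf.compact()
tables_after = len(cf.sstables)
print(tables_before, row, tables_after)
```

In this model the row is spread over two flushed tables until `compact()` merges them, which is why a read has to consult both and why compaction reduces the work a read must do.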

Friday, February 27, 2015

Data and the User Experience

Post 2: Data and the User Experience
By Matt Wheeler

Last month we shared some of the technology behind Polyvore’s Style Profile and how we’re using machine learning to understand our users' individual style to recommend more personalized products and outfits.

We discovered that our unique set data (our users create over 3 million sets every month) helps improve the recommendations to create a more engaging shopping and discovery experience for our users. Let’s dive into more detail:

Without divulging too much of the secret sauce, we recently developed three independent algorithms -- or “streams,” as we call them -- that we use to generate product recommendations.

  1. Stream 1: Generates recommendations based on a user’s brand affinity (a passion for Prada, for example).
  2. Stream 2: Generates recommendations based on collaborative filtering: items that similar users have liked.
  3. Stream 3: Leverages the talent of our awesome community of creators by recommending items frequently paired in sets.
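
The idea behind the third stream, recommending items that are frequently paired in sets, can be illustrated with a simple co-occurrence count. This is only a sketch under assumptions: the set contents, the item names, and the `build_cooccurrence` and `paired_items` helpers are made up for illustration, and nothing here describes the actual production algorithm.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(sets):
    """Count how often each pair of items appears together in a set."""
    pairs = defaultdict(Counter)
    for items in sets:
        # De-duplicate within a set so an item is not paired with itself.
        for a, b in combinations(sorted(set(items)), 2):
            pairs[a][b] += 1
            pairs[b][a] += 1
    return pairs


def paired_items(pairs, item, top_n=3):
    """Return the items most frequently paired with `item`."""
    return [other for other, _ in pairs[item].most_common(top_n)]


# Hypothetical sets; each inner list holds the products used in one set.
example_sets = [
    ["black blazer", "white tee", "ankle boots"],
    ["black blazer", "white tee", "tote bag"],
    ["black blazer", "ankle boots"],
    ["white tee", "denim shorts"],
]

cooccurrence = build_cooccurrence(example_sets)
print(paired_items(cooccurrence, "black blazer", top_n=2))
```

A real system would also need to normalize for item popularity, since a very common item co-occurs with almost everything; the raw counts here ignore that.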

We decided to get a deeper understanding of how users react to the three types of recommendations by launching an experiment that tests each stream. The results are pretty interesting and give us insight into ways that we can improve our user experience.

Which of the three streams do users prefer?
One of the features we make available is the ability to “like” a product. We can use these likes as a tool for measuring the quality of our recommendations. That is, if we show you 20 items and you like 10 of them then we are doing much better than if we show you 20 items and you like 5 of them.

We looked at the distribution of like-rate data to understand which streams users preferred:

Figure 1 - Overall like rate distribution (median = black horizontal bar)

The horizontal black lines in the middle of the boxes are the median like rates for each recommendation type. The boxes themselves represent the range of like rates in which most users fall. It is clear that the similar-users-based recommendations are the most popular, followed by the Polyvore set-based recommendations, with the brand-based recommendations bringing up the rear. So, we should focus our energies on increasing the number of impressions from user-based recommendations, right? Well, maybe. The overall variance in the individual like rates is relatively high (the boxes cover a lot of area on the graph).
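
To make the like-rate metric concrete, here is a small sketch that computes per-user like rates and summarizes them the way a box plot does, with a median and quartiles. The numbers are invented for illustration and are not Polyvore data, and the `like_rate` and `summarize` helpers are hypothetical names.

```python
from statistics import median, quantiles


def like_rate(likes, impressions):
    """Fraction of shown items that the user liked."""
    return likes / impressions if impressions else 0.0


def summarize(rates):
    """Median and quartiles, the values a box plot draws."""
    q1, _, q3 = quantiles(rates, n=4)
    return {"q1": q1, "median": median(rates), "q3": q3}


# Invented (likes, impressions) pairs, one per user, for illustration only.
observations = {
    "similar_user": [(10, 20), (6, 20), (12, 20), (3, 20)],
    "set": [(8, 20), (5, 20), (9, 20), (2, 20)],
    "brand": [(5, 20), (2, 20), (7, 20), (1, 20)],
}

summaries = {}
for stream, pairs in observations.items():
    rates = [like_rate(likes, shown) for likes, shown in pairs]
    summaries[stream] = summarize(rates)
    print(stream, summaries[stream])
```

Liking 10 of 20 items gives a rate of 0.5 and liking 5 of 20 gives 0.25, matching the comparison made earlier; the spread between `q1` and `q3` is what the height of each box represents.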

Does this mean that we have a lot of heavy “likers” and a bunch of “non-likers”? Does it mean that users actually have individual preferences for recommendation type? Or does it mean something else entirely?

To answer these questions, we used each user’s likes to create their “perfect mix” of streams and compared it against the average user’s perfect mix. For instance, if we showed you 100 recommendations from each stream and in each stream you liked 10 items, then your perfect mix would be an even 33%/33%/33% mix of recommendations from each stream. If the average Polyvore user’s perfect mix was 40% similar-user-based recommendations, 35% set-based recommendations, and 25% brand-based recommendations then we would say you have a -7% relative affinity for similar-user-based recommendations, a -2% affinity for set-based recommendations, and a +8% affinity for brand-based recommendations. Plotting these affinities per user in ascending order we get the following graphs:

Figure 2 - Distribution of individual preference for each of the three streams. The x-axis is individual users, sorted by increasing stream affinity. A positive .1 means that a user’s ideal mix of streams would add an additional 10% to the average mix (e.g. go from 25% of total individual impressions to 35% of total individual impressions). Note that it is not possible for a stream to be increased in a user’s ideal mix without a decrease in at least one other stream.

In each of these graphs, a large negative value indicates a user who likes the stream much less than the average Polyvore user, while a large positive number indicates someone who likes the stream much more. We can see that, for each stream, there is a non-trivial minority of users who have a strong positive affinity. There is a similarly sized minority who have a strong negative affinity. This suggests that we could improve the individual user experience by showing users more impressions from the streams they like and fewer from those they dislike.
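
The “perfect mix” example from earlier in this section can be worked through in a few lines of Python. This is a sketch under one assumption that the post does not spell out: a user's perfect mix is taken to be proportional to their like rate in each stream. The function names are hypothetical.

```python
def perfect_mix(likes, impressions):
    """Share of each stream in a user's ideal mix.

    Assumption for this sketch: shares are proportional to like rate.
    """
    rates = {s: likes[s] / impressions[s] for s in likes}
    total = sum(rates.values())
    if total == 0:
        # No likes at all: fall back to an even mix (also an assumption).
        return {s: 1 / len(rates) for s in rates}
    return {s: r / total for s, r in rates.items()}


def relative_affinity(user_mix, average_mix):
    """Difference between a user's mix and the average user's mix."""
    return {s: user_mix[s] - average_mix[s] for s in user_mix}


# Numbers taken from the worked example in the text.
likes = {"similar_user": 10, "set": 10, "brand": 10}
impressions = {"similar_user": 100, "set": 100, "brand": 100}
average_mix = {"similar_user": 0.40, "set": 0.35, "brand": 0.25}

mix = perfect_mix(likes, impressions)
affinity = relative_affinity(mix, average_mix)
for stream, value in affinity.items():
    print(f"{stream}: {value:+.0%}")
# similar_user: -7%
# set: -2%
# brand: +8%
```

Because both mixes sum to 100%, the affinities always sum to zero, which is why one stream cannot gain share in a user's ideal mix without another losing it.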

Testing our hypothesis with an experiment:
Another interesting finding concerns the relationship between users’ reactions to the streams. It turns out that there is a pronounced negative correlation between brand-based affinity and similar-user-based affinity:

Figure 3 - 95% prediction interval of similar user affinity regressed on brand affinity

This chart uses the same values as those in Figure 2 and shows that users who have a strong affinity for recommendations based on similar-users generally have a strong negative affinity for brand-based recommendations, and vice versa. Interestingly, this relationship is much weaker when comparing similar-user-based affinity with set-based affinity.
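
For readers who want to see what “regressing one affinity on another” involves, here is a minimal sketch that computes a Pearson correlation and a least-squares line in plain Python. The affinity values are invented purely to show the mechanics and do not reproduce the data behind the figures; the `pearson_r` and `ols_fit` helpers are hypothetical, and the prediction intervals drawn in the charts are not computed here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ols_fit(xs, ys):
    """Least-squares slope and intercept for y regressed on x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


# Made-up per-user affinities, for illustration only.
brand_affinity = [-0.10, -0.05, 0.00, 0.05, 0.10]
similar_user_affinity = [0.08, 0.05, -0.01, -0.04, -0.08]

r = pearson_r(brand_affinity, similar_user_affinity)
slope, intercept = ols_fit(brand_affinity, similar_user_affinity)
print(f"r = {r:.2f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
```

A correlation near -1 with a negative slope is the pattern described for brand versus similar-user affinity; a weak relationship would show a correlation closer to zero.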

Figure 4 - 95% prediction interval of similar user affinity regressed on set affinity

This tells us that users who like recommendations from other users also appreciate -- or at least are not overly annoyed by -- recommendations based on sets. We see this same weak relationship when we compare brand-affinity folks to the set-based stream.

Figure 5 - 95% prediction interval of brand affinity regressed on set affinity

This is great news for us! Making recommendations based on sets is something that only Polyvore can do, and the data suggests that investing time to improve these recommendations will only complement the user experience. Our creators’ sets are a rich store of fashion data and this set-based product recommendation algorithm only scratches the surface of what’s possible, all powered by the unique data from our creators’ sets. Stay tuned.

Wednesday, January 7, 2015

Core of Personalization at Polyvore: Style Profile

Over the past year, our engineering team has undertaken the task of creating a more personalized experience for our users. We already have an amazing community of designers, artists, and fashion enthusiasts who come to Polyvore to get inspired around shopping. However, we felt that with a little bit of machine learning we could help users discover and shop for even more products that they may not have found on their own.

In this blog post we’ll walk through some of the ways we are using machine learning to understand our users’ individual style, which we call a Style Profile, to recommend more personalized products and outfits.

What is a Style Profile?

When we first started building each user’s Style Profile, we quickly realized how tricky quantifying fashion can be. It’s intangible, it means different things to different people, and even when most people own the same black shirt, they might wear it in completely different ways. Luckily, Polyvore is uniquely positioned to understand personal style through our users’ rich interactions on Polyvore, including:

  • Global factors: occasions, trends, seasonality and other contextual information
  • Catalog data: rich and high-quality metadata of products from our retail partners
  • Product data: product likes and dislikes, collections of products, products viewed and search queries
  • Shopper behavioral data: impressions, likes, outbound clicks while they are interacting with products, sets and other curated content
  • Community data: Our global community has generated billions of data points that help us understand the relationship between retail products. Every time a user creates a set, they are implying that those products go together and share the same style.

From a technology standpoint, a user’s Style Profile can be represented as a vector in a high-dimensional space, with the component for each dimension indicating the strength of their preference for a particular aspect, or a combination of multiple aspects, of fashion. The following is a simplified representation of two users’ style profiles on combinations of color, category, material and brand:

Figure 1: Style profiles

  • Style Space Definition: a high-dimensional space where any point represents the style of a user or product, subject to the constraint that points with similar styles should be closer to each other than points with different styles.
  • Style Vector Definition: the coordinates of a point in the Style Space, which form the taste vector for that particular user or product.
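
The two definitions above can be illustrated with a tiny example in which users and products share one style space and “closer” is measured with Euclidean distance. Everything concrete here is an assumption for the sake of the sketch: the three dimensions, the vector values, the product names, the choice of Euclidean distance, and the `closest_products` helper.

```python
from math import sqrt

# Hypothetical dimensions; a real style space would have many more.
DIMENSIONS = ["black leather jacket", "floral cotton dress", "white canvas sneaker"]


def euclidean_distance(u, v):
    """Straight-line distance between two style vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def closest_products(style_vector, products, top_n=1):
    """Names of the products whose style vectors lie nearest to the user's."""
    ranked = sorted(
        products,
        key=lambda name: euclidean_distance(style_vector, products[name]),
    )
    return ranked[:top_n]


# Made-up style vectors: one component per entry in DIMENSIONS.
user_a = [0.9, 0.1, 0.6]
user_b = [0.1, 0.8, 0.3]
products = {
    "biker jacket": [1.0, 0.0, 0.2],
    "sundress": [0.0, 1.0, 0.1],
}

print(closest_products(user_a, products))
print(closest_products(user_b, products))
```

With these made-up numbers, the first user sits nearest the biker jacket and the second nearest the sundress, which is the behavior the style space constraint asks for: similar styles end up close together.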