
Well, I also built a program[1] to help me find an apartment in Vienna, in 2011. We actually needed two apartments, and we wanted them close to one another. We quickly found out how bad real estate websites are with moderately complex queries, even when looking for only a single apartment. With the additional constraint of needing two apartments close to each other... yeah, they were mostly useless.

So I built some shell scripts that scraped all the websites and generated tables of pairs of houses, sorted by the distance between them, in a format that awk can read easily. Then I could run queries on them.

All the scraping was done with regular expressions, no fancy HTML parsing here.
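
To illustrate the regex approach, here is a minimal sketch in Python rather than the original shell. The HTML snippet, the class names, and the `scrape` and `to_tsv` helpers are all made up for the example; real listing pages look different and each site needs its own pattern.

```python
import re

# Hypothetical listing markup; real sites differ and change often.
SAMPLE_HTML = """
<div class="listing"><span class="addr">Mariahilfer Strasse 1, 1060 Wien</span><span class="price">EUR 850</span><span class="rooms">2 Zimmer</span></div>
<div class="listing"><span class="addr">Praterstrasse 10, 1020 Wien</span><span class="price">EUR 1.200</span><span class="rooms">3 Zimmer</span></div>
"""

# One pattern per site: capture address, price and room count.
LISTING_RE = re.compile(
    r'<span class="addr">(?P<addr>[^<]+)</span>'
    r'<span class="price">EUR (?P<price>[\d.]+)</span>'
    r'<span class="rooms">(?P<rooms>\d+) Zimmer</span>'
)


def scrape(html):
    """Return a list of (address, price_eur, rooms) tuples found in html."""
    rows = []
    for m in LISTING_RE.finditer(html):
        # Prices use "." as a thousands separator here: "1.200" -> 1200.
        price = int(m.group("price").replace(".", ""))
        rows.append((m.group("addr"), price, int(m.group("rooms"))))
    return rows


def to_tsv(rows):
    """Format rows as tab-separated lines, which awk -F'\\t' can read."""
    return "\n".join("\t".join(str(field) for field in row) for row in rows)


print(to_tsv(scrape(SAMPLE_HTML)))
```

The output is one tab-separated line per listing, which is the kind of table that is easy to query with awk afterwards.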

For the distance calculation, I got the geographic coordinates by piping the address of each residence to Google Maps. I then calculated the geodesic distance between them in my scripts. Initially I wanted to let Google Maps calculate the more useful walking distance between residences, but that made the algorithm O(n²) in API requests, and I ran into Google API free quota issues even with O(n). Geodesic distance was a good enough proxy though.
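
A rough sketch of the local distance step, in Python rather than the original shell. It assumes the coordinates have already been geocoded (one lookup per address) and uses the haversine great-circle formula as a stand-in for a geodesic distance; the original scripts may have used a different formula. The place names and coordinates are approximate and only there for illustration.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def pairs_by_distance(places):
    """places: list of (name, (lat, lon)). Returns (km, name_a, name_b), closest first."""
    pairs = [(haversine_km(p[1], q[1]), p[0], q[0])
             for p, q in combinations(places, 2)]
    return sorted(pairs)


# Approximate coordinates, for illustration only.
places = [
    ("Stephansplatz", (48.2085, 16.3731)),
    ("Westbahnhof", (48.1967, 16.3378)),
    ("Praterstern", (48.2185, 16.3920)),
]

for km, a, b in pairs_by_distance(places):
    print(f"{km:.2f}\t{a}\t{b}")
```

The pair loop is still quadratic, but it runs locally; only the geocoding touches the API, and that stays at one request per address.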

Being able to use awk, I could run any kind of arbitrary query I could think of. I filtered out all residences that were not direct sales (i.e. that went through an agency), that were unfurnished, that were in a place where I didn't want to live (few such places in Vienna though), that were outside my price range, etc. Basic stuff.

More importantly, I could create arbitrary utility functions. For example, it was really important for me that the apartments were close together. So I was willing to sacrifice location, or total area, or the number of rooms if they were really close, but if they were further apart, I required more rooms or a better location. No real estate agent or website will be able to do this for you.
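
As a sketch of what such a utility function could look like (in Python, not the original awk): the weights, the 5 km cutoff, and the `location_score` scale below are invented for the example and are not the values actually used.

```python
def pair_utility(distance_km, total_rooms, location_score):
    """Score a pair of apartments; higher is better.

    All weights are hypothetical. Closeness dominates, so a nearby pair
    can get away with fewer rooms or a worse location, while a distant
    pair has to make up for it elsewhere.
    """
    # 1.0 when both apartments share an address, falling to 0.0 at 5 km or more.
    closeness = max(0.0, 1.0 - distance_km / 5.0)
    return 3.0 * closeness + 0.5 * total_rooms + 1.0 * location_score


near_small = pair_utility(distance_km=0.3, total_rooms=4, location_score=2)
far_small = pair_utility(distance_km=4.0, total_rooms=4, location_score=2)
far_large = pair_utility(distance_km=4.0, total_rooms=8, location_score=3)

# The far pair only wins once it offers clearly more rooms and a better location.
print(near_small, far_small, far_large)
```

Sorting the table of pairs by this score in descending order gives a ranked shortlist instead of a plain yes/no filter.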

In the end, it was too much trouble to rent two apartments, so we only rented one. However, the software was still very useful, as it presented all the data in a much more useful format, merging the listings from all the websites, with the data already filtered.

Plus being all text-based, and this being Unix and all that, I could easily manually input the 100 or so metro stations as "houses", so we could sort apartments based on the distance to the nearest metro station. And again I could create arbitrary utility functions. For example the U3 line is much more important for us than the U6 line, and we really don't care about U2 at all.
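
One possible way to express the per-line preference, again as a Python sketch rather than the original scripts. The station list is a tiny sample with approximate coordinates, each station is tagged with a single line for simplicity, and the weights and the scoring formula are assumptions made up for the example.

```python
import math

# Hypothetical weights: U3 matters most, U6 less, U2 not at all.
LINE_WEIGHT = {"U3": 1.0, "U6": 0.5, "U2": 0.0}

# (name, line, lat, lon); approximate coordinates, illustration only.
STATIONS = [
    ("Neubaugasse", "U3", 48.1993, 16.3522),
    ("Westbahnhof", "U6", 48.1967, 16.3378),
    ("Schottentor", "U2", 48.2148, 16.3625),
]


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def metro_score(apartment, stations=STATIONS):
    """Best weighted proximity to any station; higher is better.

    A station on a line with weight 0 contributes nothing, however close it is.
    """
    best = 0.0
    for _name, line, lat, lon in stations:
        weight = LINE_WEIGHT.get(line, 0.0)
        distance = haversine_km(apartment, (lat, lon))
        best = max(best, weight / (1.0 + distance))
    return best


at_u3 = metro_score((48.1993, 16.3522))  # on top of the U3 station
at_u2 = metro_score((48.2148, 16.3625))  # on top of the U2 station
print(at_u3, at_u2)
```

With these weights, an apartment right next to a U2 station scores no better than its distance to the nearest U3 or U6 station allows.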

[1] https://code.google.com/archive/p/operation-housefinder/


