Vert.x Boot first step

So, I happen to find myself with a lot of free time at my disposal. Having spent the last three years head-of-teching in a startup, I finally also have the time to think about the technologies we used, the shortcomings, the pitfalls and what I would do differently on the next project.

Turns out that I really fell in love with Vert.x as a technology, as well as with Java 10 and RxJava2. As you can still see on GitHub, we built an awful lot of components around that. Vert.x is just blazingly fast and easy to understand, RxJava makes for nice reactive streams and modern Java is just fun to work with.

We also had a couple of Spring Boot services, mostly for simplicity and the hype four years ago. In general Spring Boot brings a lot of features, most of them you don’t need or don’t know. It is also quite slow when it comes to startup, in a container, on a VM, in the Cloud.

Now, while I am wondering what to do next and what to do in general I started wondering what I would do knowing what I know now.

We mostly used a stack based on Vert.x, RxJava (sadly 1.x) and Vert.x Jersey. We glued it all together with a couple of abstractions over Verticle deployment and got our own nice stack this way.

Things I would do differently (and have already done with newer services) are using RxJava2 and replacing HK2 with Guice. HK2 is nice but also slightly inferior to Guice in my opinion. For some reason HK2 is not that well maintained; as of last week there was still no released version with support for Java 10.

And now I am wondering if I should take this whole stack to a new level and build something like Vert.x Boot, a system that easily lets you bootstrap Vert.x microservices for a variety of use cases with the following features:

  • Easy to use bootstrapping process like Spring Boot has
  • Guice as DI and general application configuration approach (I strongly believe in using code for as many things as possible)
  • Vertx-Jersey as a general REST/JAX-RS abstraction
  • Metrics and Prometheus integration out of the box using Micrometer
  • Defined readiness and liveness probe patterns (health checks)
  • Very good RxJava2 integration
  • Minimal dependency footprint that is easy to upgrade in order to follow Java’s amazingly fast upgrade cycle
  • Some well defined testing and containerization patterns (we all build images now, no?) that make it easy to build “golden” images, maybe using Testcontainers

On top of that there should be some modules to integrate with commonly used data layer services and provide the infrastructure that you usually use around them, like connection pooling and schema migrations.

Bonus

  • I really like GraphQL, so something that would abstract around GraphQL Java to build easy to use GraphQL endpoints would also be nice
  • I’m not so sure about my feelings about Jigsaw and the way the Java eco-system is going, but having a good way to work with modules would also be a bonus.

I used to say that Java is not a great fit for microservices because of memory usage and startup times, but with this stack it might actually make sense. I haven’t yet completely figured out how low you can go, but with a very low old-gen baseline and requests that only ever live in the young generation you can actually run those things with a very small memory footprint.

Maybe that is something to fill my time with once the summer ends.

Berlin Buzzwords 2015 – Day 2

It’s All Fun And Games Until…: A Tale of Repetitive Stress Injury (Eric Evans)

Basically, watch out for yourself. If it hurts, you are doing something wrong. It’s good to be reminded of that from time to time.

A complete Tweet index on Apache Lucene (Michael Busch)

Michael Busch has given this talk in one version or another for a couple of years now. Unfortunately it has become shallower, with fewer technical details about how they optimized Lucene for Twitter. The numbers are great: they have two billion queries per day and about 500 million tweets per day. One thing that he didn’t mention in his earlier talks is that they only found out that Earlybird scales to Twitter-level request volumes because of the earthquake in Japan, when they had to do an emergency shutdown of the caching layer in front of it (which was Ruby on Rails and did not scale that well). Nowadays they have all tweets in a pretty vanilla Lucene with some additions (going to be open sourced soon) and use a Mesos cluster in case they have to reindex all the data.

And BTW: Tweet IDs encode the timestamp the tweet was sent at.
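
To make that last point concrete, here is a minimal sketch that is not from the talk. It assumes Twitter's publicly documented Snowflake layout, where the bits above the lowest 22 hold a millisecond timestamp relative to a custom epoch. The class and method names are made up for this example, and it only applies to IDs generated with that scheme.

```java
import java.time.Instant;

// Hypothetical helper, not part of any Twitter library.
final class SnowflakeIds {

    // Custom epoch of the Snowflake format, in milliseconds since 1970-01-01 UTC.
    static final long TWITTER_EPOCH_MILLIS = 1288834974657L;

    // The lowest 22 bits hold the machine id and a per-machine sequence number.
    static final int NON_TIMESTAMP_BITS = 22;

    private SnowflakeIds() {
    }

    // Returns the creation time in milliseconds since the Unix epoch.
    static long timestampMillis(long tweetId) {
        return (tweetId >> NON_TIMESTAMP_BITS) + TWITTER_EPOCH_MILLIS;
    }

    static Instant createdAt(long tweetId) {
        return Instant.ofEpochMilli(timestampMillis(tweetId));
    }
}
```

A nice side effect of this layout is that sorting by ID roughly sorts by creation time.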

Automating Cassandra Repairs (Radovan Zvoncek)

First-time speaker Radovan did a pretty good job. Apparently there are several ways to get to the “consistency” of the “eventual consistency” in Cassandra, which are Read Repair, Hinted Handoff and full-blown anti-entropy repairs. The latter apparently can lead to a lot of problems if done improperly, so Spotify built something to manage that: Reaper. The problems are usually due to disk IO limits, network saturation or just plain full disks. Spotify Reaper orchestrates anti-entropy repairs to make them reliable.

I’m still somewhat confused that one apparently has to spend a lot of time repairing Cassandra clusters. I always thought that was what Cassandra was doing.

Diving into Elasticsearch Discovery (Shikhar Bhushan)

For all the people who, like me, forgot how ES clustering works, this was a good reminder. Plus I learned that discovery is pluggable, so you can write your own plugin to provide the clustering part for ES. He apparently did and wrote Eskka, an Akka-based clustering approach. Writing your own apparently isn’t that much fun because the APIs change all the time. Just in case you forgot, Zen is the default way ES clusters.

Change Data Capture: The Magic Wand We Forgot (Martin Kleppmann)

We all know the problem Martin was describing: the same data in different forms, like in your database, in your cache and in your search engine. He went back to the “Change Data Capture” principle, which basically says “save once, distribute everywhere”. To realize that he wrote a PostgreSQL plugin, “Bottled Water”, which gets the changes from Postgres and posts them to a Kafka topic. Yay for the best project name this year in the category: will never find that on Google.

His implementation and idea are solid. The problem is that there is one Kafka topic per table, so you actually lose the transaction boundaries when reading from Kafka. Otherwise it is transaction-safe: messages are only sent when the transaction in Postgres commits. He uses Avro on the wire and transforms the Postgres DDL schema into an Avro schema.

If you want to get your transactions back you would need a stream processor (Storm/Spark) downstream to reassemble them. This might be a good idea if you already have a Postgres DB or rely on some special properties of a centralized datastore; otherwise it is OK if your microservices write directly to Kafka.

Has someone actually coined the word “nanoservices” yet for designs where each service basically does just one thing? Like taking the request and writing it to a queue (Kafka), with all other processing done by consumers down the queue that each do just one thing as well.

Designing Concurrent Distributed Sequence Numbers for Elasticsearch (Boaz Leskes)

Elasticsearch is rewriting the way they do distributed indexing based on the Raft Consensus Algorithm. Sounds great, as it should mitigate a lot of the problems they have right now.

Apache Lucene 5 – New Features and Improvements for Apache Solr and Elasticsearch (Uwe Schindler)

Apparently, Lucene 4 broke a lot of indexes due to its built-in backward compatibility with Lucene <=3. With two big companies actually relying on Lucene, that kind of amazes me.

Lucene 5 gets rid of all this legacy stuff and drops support for older indexes. Plus it adds a lot of data safety features when it comes to on-disk indices like checksums and sequence numbers. So, Solr and Elasticsearch should finally be production ready …. ;).

The JDK seems to keep breaking Lucene (remember that the initial JDK 7 release broke Lucene?); apparently one should not use the G1 GC with Lucene (ES? Solr?).

And Lucene 5 uses a lot of the “new” JDK 7 APIs for IO to finally get the index safely to disk.

 

Real-Time Monitoring of Distributed Systems (Tobias Kuhn)

Less distributed, more real-time monitoring. Apparently they built their own system for analyzing their logs for anomaly detection, punnily named Anna Molly, which has now been open sourced.

They made it pretty clear that thresholds are not enough if you have a highly dynamic system that can change in multiple dimensions at any time. Seasonality of your data makes it even harder to define useful thresholds. There are a couple of algorithms which can be used for anomaly detection, namely Tukey’s outlier detection and seasonal trend decomposition. And t-digest comes to the rescue of course.
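
To give an idea of what Tukey’s outlier detection does, here is a minimal sketch. It is not the Anna Molly implementation. It assumes the conventional fence multiplier of 1.5 and a simple linear-interpolation quantile; both are choices made for this example, as is the class name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper illustrating Tukey's fences on a batch of measurements.
final class TukeyOutliers {

    // Conventional multiplier for the interquartile range (assumption of this sketch).
    static final double FENCE_FACTOR = 1.5;

    private TukeyOutliers() {
    }

    // Linear-interpolation quantile on an already sorted, non-empty array (q in [0, 1]).
    static double quantile(double[] sorted, double q) {
        double position = q * (sorted.length - 1);
        int lower = (int) Math.floor(position);
        int upper = (int) Math.ceil(position);
        return sorted[lower] + (position - lower) * (sorted[upper] - sorted[lower]);
    }

    // Returns all values outside [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR], in input order.
    static List<Double> outliers(double[] values) {
        List<Double> result = new ArrayList<>();
        if (values.length == 0) {
            return result;
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double q1 = quantile(sorted, 0.25);
        double q3 = quantile(sorted, 0.75);
        double iqr = q3 - q1;
        double lowerFence = q1 - FENCE_FACTOR * iqr;
        double upperFence = q3 + FENCE_FACTOR * iqr;
        for (double value : values) {
            if (value < lowerFence || value > upperFence) {
                result.add(value);
            }
        }
        return result;
    }
}
```

Because the fences are derived from the data itself, they move with the signal instead of being fixed like a static threshold.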

For monitoring they actually use a cascade of statsd and carbon.

 

To sum up bbuzz 2015:

Berlin Buzzwords 2015 – Day 1

Analytics in the age of the Internet of Things (Ludwine Probst)

Basically a talk about analyzing a demo dataset from sports activity via Spark. Not much new stuff in there, but beautifully hand-illustrated slides.

Real time analytics with Apache Cassandra and Apache Spark (Christopher Batey)

Good speaker, awesome talk. Some takeaways:

  • One should read the Dynamo paper
  • You can (mis)use the datacenter awareness of Cassandra for isolating workloads if you run Spark on top of it.
  • 500ms is the lowest useful microbatch length

Application performance management with open source tools (Tudor Golubenco, Monica Sarbu)

Packetbeat, which apparently just joined Elastic, is a TCP-layer application monitoring solution. So what they basically do is understand your protocol (HTTP, Redis, Postgres) and give you metrics directly from the traffic (how long did my HTTP request take?). Sounds really interesting, as it doesn’t need any integration. It will be integrated into the ELK stack and get some more data providers. I really like the idea.

Practical t-digest Applications (Ted Dunning)

t-digest is an algorithm to get realtime quantiles/percentiles out of your data. That comes in handy if you want to have the data always at your fingertips and/or want to identify outliers. It is blazingly fast and needs constant memory, so you actually want to have it wherever you have numbers. Of course there is a Java-Library and an integration into Elasticsearch. Awesome speaker as well.

The Do’s and Don’ts of Elasticsearch Scalability and Performance (Patrick Peschlow)

Basically a long reminder to RTFM. Know what data you need, know the pitfalls, disable features you don’t need and make sure that your cluster setup fits your requirements.

Detecting Events on the Web with Java, Kafka and ZooKeeper (James Stanier)

Good speaker, I think they build quite interesting stuff at Brandwatch. It was not clear to me till the end why they built all that stuff themselves, but I think they co-evolved with Storm/Spark and just made their existing software cluster-aware rather than rewriting it.

Reminded me that there is Apache Curator, a set of high-level abstractions for ZooKeeper services (https://curator.apache.org/).

Analyzing and Searching Streams of Social Media at Scale using Spark, Kafka and Elasticsearch (Markus Lorch)

IBM is using Spark. Basically they got a pretty standard setup to get a lot of data from Twitter and enrich/augment that with some of their proprietary tech (mood detection etc.). Nothing special here.

Predictive Insights for IT Operations (Omer Trajman)

Actually a pretty good speaker, but I didn’t really get the whole point. He basically explained that you should use the same big data techniques for analyzing the data that comes out of your operations measurements (and btw, he has a company specializing in that). But it wasn’t a sales talk. So basically, yes, analyze all the data.

 

Programming Concurrency on the JVM

The fun thing about Venkat Subramaniam’s latest book is the way he jumps to and fro between a couple of JVM-based languages: Java, Scala and Clojure. He shows interesting ways to do given tasks in different languages and introduces a couple of interesting frameworks that I at least had not heard of before. The most interesting framework he shows is Akka, an actor-based concurrency framework written in Scala that comes with an API that is just as usable from Java. Furthermore, I knew the concepts of Software Transactional Memory but did not know that there are working implementations for the JVM, one of them being Multiverse.

 

I really liked the book, because I really like the idea of using a framework implemented in Scala through its Java API in Groovy. Sometimes it gets quite tiresome to have many of the same examples just shown in different languages, but it is always good to practice Scala or Clojure reading skills. If you feel safe in Java concurrency, i.e. you know Java Concurrency in Practice by heart, I warmly recommend this book.

Secrets of the JS Console

For some reason, up until three weeks ago, I didn’t know that there is more to the API of the JS console than “console.log”. The API was first defined by Firebug and is now more or less considered standard in all major browsers. Here is a short list of useful commands:


$$('.foo') // gives you a shortcut to querySelectorAll

$_ // gives you the result of the last evaluated expression

$0 // gives you access to the currently selected DOM element in the Inspect view

$1 // gives you access to the previously selected DOM element

clear() // clears the console

console.time('foo') // starts a timer named foo

console.timeEnd('foo') // stops the timer foo and prints the elapsed time

inspect( domElement ) // shows the given DOM element in the Inspect view

The documentation can be found either in the Firebug Wiki or in Chrome’s DevTools docs.

Google Merchant Module for ROME

Google Merchant is the way a company can feed their products into Google Shopping, and ROME is a Java library to create and parse RSS and Atom feeds. Google Merchant allows RSS as one of the feed formats you can use to feed them your data. In order to get additional information into ROME-created feeds you have to implement four classes, so it is not really easy to get your data there.

I created a small library to make RSS feeds that conform to Google Merchant’s requirements:

        // Create the feed and register the Google Merchant module on it.
        final SyndFeed feed = new SyndFeedImpl();
        feed.getModules().add( new GoogleMerchantModuleImpl() );

        final SyndEntry entry = new SyndEntryImpl();

        entry.setTitle( "Title" );

        // Attach the merchant-specific data to the entry.
        final GoogleMerchantModule merchantData = new GoogleMerchantModuleImpl();
        merchantData.setImageLink( "SOME IMAGE URL" );

        entry.getModules().add( merchantData );

It’s under the MIT License and up on GitHub; you can download version 0.1 directly here.

Code with style: Readability is everything

So, by now everybody got it: Code is there to be read by people, to be analyzed by people and to be understood by people. The fact that you can put it through a compiler and run it is a nice side effect, but nothing to focus on. Besides writing readable tests of course.

But when software is growing and many different hands touch the same spots, it somehow gets dirty. So even with usually quite high coding standards, it can still happen that you stumble upon something like this:

User user = userService.createNewUser( email, password,
                                       false, true, null, 5 );

I like to be able to read a line of code like a line of text. Which means that I at least want to be able to get the “words” without the surrounding context.

So what would you be able to tell about the above snippet? I would say: it apparently creates a new user, using some service, and it requires an email and a password, which might be stored somewhere. And the point that makes me cry: apparently some magic flags, a nullable parameter and a magic number.

So let’s see how we can clean this up with some new patterns. I usually make up my own names; if someone has a more common name for them feel free to comment, I will be happy to replace them.

Enum as Flag Pattern (aka do not use boolean literals) 


enum FailIfExists { YES, NO }
enum NewPasswordChecks { YES, NO }

User user = userService.createNewUser( email, password,
                                       FailIfExists.NO,
                                       NewPasswordChecks.YES,
                                       null, 5 );

So, what can you tell about the method call now? The problem of methods that react to flags aside, the readability is better. You can at least deduce now that there might be no error if the user already exists and that it uses the “new” password checks. It is even easier to refactor to use a third password validation algorithm should it ever be changed again.

When setting up a new project you might try to disallow all boolean literals in certain high-level classes; I don’t know if it works that well with third-party libraries. This might be a cool Checkstyle rule to try.

No nullable Parameters Pattern (aka keep your implementation details internal)

So what is my next problem? Well, the given null parameter value. I believe that the ability to cope with null values is an implementation detail that belongs in the class that defines a method. So the UserService interface should provide us with an overloaded version of createNewUser that does not have the fifth parameter. Then the implementation could hide the fact that this parameter is really nullable. And it avoids the clutter of methods that have n optional object parameters.
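
A minimal sketch of that idea, with simplified and entirely hypothetical types: the real UserService is not shown in this post, the flag parameters are left out for brevity, and the optional parameter is assumed to be a referrer. The public overloads never accept null; only the private method knows the value may be absent.

```java
import java.util.Objects;
import java.util.Optional;

// Hypothetical value type for this sketch.
final class User {
    final String email;
    final Optional<String> referrer;
    final int invites;

    User(String email, Optional<String> referrer, int invites) {
        this.email = email;
        this.referrer = referrer;
        this.invites = invites;
    }
}

// Hypothetical service; a class instead of an interface to keep the sketch short.
final class UserService {

    // Overload for callers without a referrer: no null at the call site.
    User createNewUser(String email, String password, int invites) {
        return create(email, password, null, invites);
    }

    // Overload for callers with a referrer: null is rejected early.
    User createNewUser(String email, String password, String referrer, int invites) {
        Objects.requireNonNull(referrer, "referrer");
        return create(email, password, referrer, invites);
    }

    // The only place that knows the referrer may be absent.
    private User create(String email, String password, String referrerOrNull, int invites) {
        return new User(email, Optional.ofNullable(referrerOrNull), invites);
    }
}
```

A caller then writes `userService.createNewUser( email, password, 5 )` and never has to guess what a null in the middle of the argument list means.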

If you use Findbugs in combination with the great JSR 305 annotations, which by the way can be enforced using a Checkstyle plugin, you might try to disallow using the Nullable Annotation in public methods. Maybe even for protected methods. In any case, you should never have to use a null parameter while calling a visible method of another class.

No literals outside constant definitions (aka give names to your magic numbers)

The last thing is a classic, but there are still a couple of people who use literals instead of constants. I think the general rule is that you should never use a numeric literal inside a method, but always a constant declared in the class. Furthermore this might even be extended to String literals and, as described in the first point, to boolean literals.

So let’s revisit my short (and bad) example taking the above mentioned points into consideration:


enum FailIfExists { YES, NO }
enum NewPasswordChecks { YES, NO }

private static final int INITIAL_NUMBER_OF_INVITES = 5;

User user = userService.createNewUser( email, password,
                                       FailIfExists.NO,
                                       NewPasswordChecks.YES,
                                       INITIAL_NUMBER_OF_INVITES );

Given that you might see these lines of code during a review, where you can’t browse the complete source of your project, you might now be able to better understand what this method does: it creates a new user with the given email and password, succeeds even when the user already exists, uses some mythical new password check and gives the user an initial number of five invites. This can be guessed by just reading the code, without a single line of JavaDoc and even without once visiting the declaring class or interface.

So in future, when writing code, try to consider if you would understand your method calls without ever having seen the implementation or even the declaration of the called method.

Quickhacks: Visualizing contributions with git

Contributions for Git

So I had a little fun with git history visualization. Gource is there to show the progress of a repo over time, but I always wanted to get an actual snapshot of the repo at a given point in time. I wrote a little program that basically does a “git ls-files” and then a “git blame -e” for all files that seem to be source files. Then it resolves the email addresses using Gravatar and scales each image relative to the number of lines the contributor is currently blamed for. The above image shows the visualisation for Git itself. It still needs some polishing, the algorithms for packing and sizing are all a bit off and not really correct, but mostly it creates nice images. It is up on GitHub if you want to play with it.
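
The two core steps can be sketched roughly like this. This is not the program mentioned above, just a minimal illustration with made-up names. It assumes that each “git blame -e” line contains the author as “(<email>” and uses Gravatar’s documented scheme of hashing the trimmed, lower-cased email address with MD5.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for this sketch.
final class BlameStats {

    // Assumed shape of a "git blame -e" line: "<hash> (<email> <date> <line no>) <code>".
    private static final Pattern EMAIL = Pattern.compile("\\(<([^>]+)>");

    private BlameStats() {
    }

    // Counts how many lines each email address is currently blamed for.
    static Map<String, Integer> linesPerAuthor(List<String> blameLines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : blameLines) {
            Matcher matcher = EMAIL.matcher(line);
            if (matcher.find()) {
                counts.merge(matcher.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    // Builds the avatar URL from the MD5 hash of the normalized email address.
    static String gravatarUrl(String email, int sizePx) {
        String normalized = email.trim().toLowerCase(Locale.ROOT);
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(normalized.getBytes(StandardCharsets.UTF_8));
            String hex = String.format("%032x", new BigInteger(1, digest));
            return "https://www.gravatar.com/avatar/" + hex + "?s=" + sizePx;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 should be available on every Java platform", e);
        }
    }
}
```

The line counts can then be turned into relative image sizes, which is the part the packing and sizing algorithms have to deal with.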

Contributions by Author for Prototype (the js library)
Contributions for pdf.js

Quickhacks: Github Event Widget for WordPress

A couple of days ago GitHub announced their new Event timeline API. As I wanted to have a nice widget for this blog but didn’t want to write any PHP, I thought maybe they would do JSONP and I could implement a client-side, pure JavaScript solution. Apparently, they accept a “?callback=” parameter on the request, so using jQuery’s excellent Ajax abstraction it’s plain simple to get your own feed:

var user = "marcus";
var url = "https://api.github.com/users/" + user + "/events?callback=?";

jQuery.ajax( {
    url : url,
    dataType : 'jsonp',
    success : handleData
} );

Now you just have to implement the handleData function to do something nice with the data. Throw in the excellent Moment.js and you get what you see to the right. The full script is in a Gist. 150 lines is still a little long but I’m not really in a golfing mood right now.
To integrate it, use a plain Text Widget for WordPress which includes the two scripts, given that jQuery is already loaded somewhere:
<script src="https://raw.github.com/timrwood/moment/master/moment.min.js"></script>
<script src="https://raw.github.com/gist/1328963/2054490501f09765cca086127377a9add4781f2c/github-events.js"></script>

<ul id="github"></ul>
Feel free to fork and create your own. I did not map all events because I just wanted a selection to be displayed.

Code with style: Exception handling antipatterns

There are a couple of exception handling patterns that I come across regularly that I believe to be plain wrong or harmful. In order to get rid of them I’d like to present them here and explain the reasons why they should be avoided:

Log-Log-Log and Rethrow

Imagine a typical layered application design where you have a DAO layer at the bottom, a service or business logic layer in between and a frontend layer on top. Now imagine that you decided to wrap most exceptions from the lower layer and rethrow them as an exception which is more appropriate to the level of abstraction that you are currently on:

DAO Layer:

public T getById( Id id ) {
    try {
        ....
    } catch( Exception e ) {
        LOG.error( e.getMessage(), e );
        throw new WrapperException( e );
    }
}
 

Service Layer:

public T doSomething( Id id ) {
    try {
        T foo = getById( id );
        ....
    } catch( Exception e ) {
        LOG.debug( e.getMessage(), e );
        throw new WrapperException( e );
    }
}

This happens again in the UI layer. You usually do not have per layer logfiles, as this would make it harder to see the flow of the application. But when you log the exception on every layer, it leads to a lot of log clutter. The best way is to define a layer that logs the exception, the business layer being the obvious candidate here. Thus you have a defined way to handle lower layer exceptions. This pattern does not appear that often anymore, as unchecked exceptions are generally considered the better solution today.
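
A minimal sketch of the “log in exactly one layer” rule, assuming java.util.logging and made-up exception and class names. The DAO only wraps and adds context; the service layer is the single place that logs.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical exception types, named only for this sketch.
class DataAccessException extends RuntimeException {
    DataAccessException(String message, Throwable cause) {
        super(message, cause);
    }
}

class ServiceException extends RuntimeException {
    ServiceException(String message, Throwable cause) {
        super(message, cause);
    }
}

class UserDao {
    String getById(long id) {
        try {
            return queryDatabase(id);
        } catch (IllegalStateException e) {
            // No logging here: wrap, add context and let the service layer decide.
            throw new DataAccessException("Could not load user " + id, e);
        }
    }

    // Stands in for a real driver call that fails.
    private String queryDatabase(long id) {
        throw new IllegalStateException("connection refused");
    }
}

class UserLookupService {
    static final Logger LOG = Logger.getLogger(UserLookupService.class.getName());

    private final UserDao dao = new UserDao();

    String displayName(long id) {
        try {
            return dao.getById(id);
        } catch (DataAccessException e) {
            // The one place where the failure is logged, including the full cause chain.
            LOG.log(Level.SEVERE, "Lookup failed for user id " + id, e);
            throw new ServiceException("User lookup failed", e);
        }
    }
}
```

Because the cause is passed along on every wrap, the single log entry still contains the complete stack trace down to the original failure.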

Fail Silently

There are two versions of the fail silently antipattern:

Version 1:

try {
    ....
} catch ( Exception e ) {
    // empty block
}

Version 2:

try {
    ....
    return true;
} catch ( Exception e ) {
    return true;
}

The two versions of the Fail Silently pattern have the same problem: you will never know that an exception happened. Combined with the fact that Exception is caught, all kinds of runtime problems can be hidden. Usually, you should only catch specialized exceptions that you expect and can handle. If you’re writing frameworks or using libraries that might throw all kinds of exceptions, or you write handling code that needs to handle all kinds of exceptions in some way, you still must log what is happening there. Otherwise you will never be able to explain certain behavior in production. And an exception handling block should never return the same value as the usual code path. After all it is an exception handling block.
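
For contrast, a minimal sketch of what I would consider acceptable, with a made-up class and java.util.logging as an assumption: only the expected exception is caught, it is logged together with the offending input, and the failure path returns something the caller can tell apart from the normal result.

```java
import java.util.OptionalInt;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical example class for this sketch.
final class PortParser {
    static final Logger LOG = Logger.getLogger(PortParser.class.getName());

    private PortParser() {
    }

    static OptionalInt parsePort(String raw) {
        try {
            return OptionalInt.of(Integer.parseInt(raw.trim()));
        } catch (NumberFormatException e) {
            // Expected failure: log the input and the stack trace, then signal "no value".
            LOG.log(Level.WARNING, "Not a valid port: '" + raw + "'", e);
            return OptionalInt.empty();
        }
    }
}
```

Anything that is not a NumberFormatException, for example a NullPointerException caused by a missing value, is deliberately not caught here and surfaces as the bug it is.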

Fail with least possible information

try {
    ....
} catch ( Exception e ) {
    LOG.error( "Something went wrong" );
}

There are two simple rules when handling exceptions: always log the exception including the stacktrace, and always include any data which might have led to the exception. I still see it today that people discard all the information directly at their fingertips in order to log what they think might have happened at that point. Or what they expected to happen. Together with too wide catch clauses this leads to the problem that you do not really know what happened in production, you just believe you know. Even worse, you might stick to an assumption someone else made about the error at that point. So when you write a try-catch block, always see that the variables used in the try are also logged in the catch.

Deduce from vagueness

try {
    ....
} catch ( Exception e ) {
    throw new CustomerNameTakenException( "The user already exists" );
}

This is arguably the worst case of the pattern above. Here the wrong information is not only logged but also used to create error messages. This might lead to all kinds of wrong behavior. Just because you know that a certain layer or lower-level function behaves in a certain way does not make that behavior an interface. Maybe a different exception is thrown, maybe something else went wrong.

As usual, these points are open for discussion. But I believe that these are some general rules everyone needs to apply when writing exception handling code.