Published August 29, 2014
Coding , Gentoo , Linux , Mac OS X
As I explained in an earlier post, Vagrant now supports Parallels as a provider. Since I wanted to test how they work together, I created a standard 64-bit Gentoo Linux box that you can download and use. In addition to a standard Gentoo install, the box also comes with Puppet installed, so you can do some actual work on it.
Since I presume you already have the Parallels provider set up by now, this is how you can download and use the box:
vagrant init orpiske/gentoo-linux-64 && vagrant up
After the box is downloaded from the cloud, you can use Vagrant as usual (e.g. vagrant ssh).
Published August 27, 2014
Coding , DevOps , Linux , Mac OS X
Maybe this is not news anymore, but Vagrant now supports Parallels. It seems to work with Parallels Desktop 8 and above, but I wasn’t able to run it with Parallels Desktop 9 on OS X Yosemite. Upgrading to Parallels Desktop 10 seems to have fixed the issue, and it worked like a charm. One additional problem is that there’s a shortage of images in the Vagrant Cloud. Although I believe this will be fixed as the community grows and shares more templates on the cloud, this may be a nuisance to some users.
Published July 30, 2014
Coding , DevOps
I have been using Logstash extensively lately. Along with ElasticSearch, it’s a great tool to centralize logs and simplify access to them. The only difficulty I had was related to supporting multiline log messages, such as Java stack traces. I found some good examples online, but none seemed to work the way I wanted. In some cases, I also got my messages tagged as _grokparsefailure, which indicates that the grok pattern failed to match the message. I ended up with a regex that is not so different after all, but which matches exactly the way we log messages with log4j:
(^.+Exception.+)|(^\s+at .+)|(^\s+\.\.\. \d+ more)|(^\s*Caused by:.+)
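As a rough illustration of what the pattern catches, here is a minimal Python sketch that folds every matching line into the previous log event, similar in spirit to a multiline setup that joins matches to the previous line. The sample log lines, the function names, and the choice of joining to the previous line are assumptions made for the example, and the three dots of "... N more" are escaped here so they match literally. Logstash does not use Python's `re` engine, so treat this as an approximation rather than a test of the real filter.

```python
import re

# The pattern from the post, with the dots in "... N more" escaped (an adjustment
# made for this sketch so they only match literal dots).
STACKTRACE_LINE = re.compile(
    r"(^.+Exception.+)|(^\s+at .+)|(^\s+\.\.\. \d+ more)|(^\s*Caused by:.+)"
)


def is_stacktrace_line(line):
    """Return True when the line looks like part of a Java stack trace."""
    return STACKTRACE_LINE.search(line) is not None


def merge_multiline(lines):
    """Append every stack trace line to the event that precedes it."""
    events = []
    for line in lines:
        if events and is_stacktrace_line(line):
            events[-1] += "\n" + line
        else:
            events.append(line)
    return events


# Hypothetical log4j-style output, invented for this example.
sample = [
    "2014-07-30 10:15:00 ERROR [main] OrderService - Failed to process order",
    "java.lang.IllegalStateException: order is closed",
    "    at com.example.OrderService.process(OrderService.java:42)",
    "Caused by: java.io.IOException: disk full",
    "    ... 12 more",
    "2014-07-30 10:15:01 INFO [main] OrderService - Next order",
]

events = merge_multiline(sample)
for event in events:
    print(event)
    print("-----")
```

Note that the first alternative matches any line containing "Exception" followed by more text, so an ordinary log message that mentions an exception would also be folded into the previous event.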
It’s also worth mentioning that the Grok Debugger website and a good regex tutorial are two priceless resources to have at hand.
Published July 20, 2014
Analytics , Cassandra , Coding , Hadoop
I am currently working on a data analytics website for my own educational purposes. To fulfil my hacking/learning needs, I decided to use Apache Cassandra as the input/output storage engine for an Apache Hadoop map/reduce job.
The job in question is as simple as it gets: it reads the data from a table stored in a Cassandra database and identifies the most commonly used adjectives for each of the major communication service providers (CSPs) in Brazil. After processing, the results are stored in another table in the same Cassandra database. Basically, it is a fancier version of the famous Hadoop word count example.
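To make the idea concrete, here is a minimal plain-Python sketch of the map and reduce steps described above. It is not the actual job: the adjective list, the provider names, and the sample rows are all made up for illustration, and it runs in memory instead of reading from and writing to Cassandra through Hadoop.

```python
from collections import Counter, defaultdict

# Hypothetical adjective list; how the real job recognizes adjectives is not
# described in this excerpt.
ADJECTIVES = {"slow", "expensive", "good", "terrible"}


def map_row(provider, text):
    """Map step: emit ((provider, adjective), 1) for each known adjective."""
    for word in text.lower().split():
        word = word.strip(".,!?")
        if word in ADJECTIVES:
            yield (provider, word), 1


def reduce_counts(pairs):
    """Reduce step: sum the counts for each (provider, adjective) key."""
    totals = defaultdict(Counter)
    for (provider, adjective), count in pairs:
        totals[provider][adjective] += count
    return totals


# Made-up input rows standing in for the Cassandra input table.
rows = [
    ("csp_a", "Slow and expensive."),
    ("csp_a", "So slow today!"),
    ("csp_b", "Good coverage, good price."),
]

pairs = [pair for provider, text in rows for pair in map_row(provider, text)]
totals = reduce_counts(pairs)

for provider, counter in sorted(totals.items()):
    print(provider, counter.most_common(1))
```

In the real job, the map output key would have to carry both the provider and the adjective, much like the tuple key used in this sketch.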
Unfortunately, there seems to be a lack of up-to-date documentation about integrating Hadoop and Cassandra. Even the official guide seems to be deficient/outdated on this subject. To add insult to injury, I also wanted to use composite keys, which complicated things further. After reading the examples shipped with the Cassandra source code, I was able to implement a working job.
Despite the lack of documentation and the hacking required to figure out how to make it work, the process is quite simple and even an inexperienced Cassandra/Hadoop developer such as myself can do it without much trouble. In the paragraphs below you will find additional details about the Hadoop and Cassandra integration and what is required to make it work.
Finally, as is usual for my coding examples, the source code is available in my GitHub account under the open source Apache License v2.
Continue reading ‘Using Apache Cassandra with Apache Hadoop’
Published July 1, 2014
Coding , Reference Material
These are just some development-related links and articles I have read in the last few weeks which I think are worth mentioning:
Published June 27, 2014
Coding , Reference Material
We all know that JSON and RESTful web services are the new darlings of the Internet and, to some extent, backend development these days. Their simplicity compared to other mechanisms is, undoubtedly, a good thing. However, a large amount of backend development still relies (and will continue to rely) on SOAP and other mechanisms to provide services. That’s why it’s so important to understand them. This series of articles from IBM developerWorks can help you understand them:
On the other hand, if you want to understand the RESTful side of the force, you may want to read about Developing RESTful Services using Apache CXF.
Published March 17, 2014
Cloud , Coding
I’ve just published a mini e-book, in Portuguese, about Enterprise Integration with Apache Camel. If you happen to speak Portuguese, you can download it from here.
Published March 15, 2014
Cloud , Coding
Apache Commons Configuration:
It’s pretty common to need to set a hostname or a port for your service in OpenShift. If you’re using Apache Commons Configuration, there’s a quick and easy way to access variables exported by the cartridges. You can address the environment variables using the ‘env’ prefix.
Continue reading ‘Quick tips for running Java applications on OpenShift’
Published August 28, 2013
Cloud , Coding , Reference Material
NoSQL databases are some of the hottest topics in the IT industry at the moment. A beginner can easily feel swamped by the amount of documentation available. Since I am a beginner to NoSQL as well, I picked out two links which I access every now and then:
A Visual Guide to NoSQL explains how the commonly used NoSQL offerings relate to the CAP theorem.
A Beginner’s Guide to NoSQL is an article, originally written for the Software Developer’s Journal, that explains the basic principles and ideas behind NoSQL databases.
Published August 26, 2013
Cloud , Coding , Linux
Today I dedicated some time to educating myself about OpenShift, Red Hat’s Platform-as-a-Service offering. It allows us developers to quickly develop, deploy and provide scalable applications over the web.
To learn about it, I decided to deploy a really simple web application. I thought it would be a good idea to deploy the Simple CXF Server example on my free account. You can see it in action here. Because the OpenShift documentation is quite extensive, it might be complicated for a beginner like me. So I decided to take notes of my steps while I deployed a simple Apache CXF-based application.
These are the steps I had to follow:
Continue reading ‘Running the Simple Apache CXF Server Example on Red Hat Openshift’