Archive for the ‘Java Box’ Category

I was looking for a log analyser for logs from our system, and I found these interesting.

1. Scribe – Real time log aggregation used in Facebook
Scribe is a server for aggregating log data that’s streamed in real time from clients. It is designed to be scalable and reliable. It is developed and maintained by Facebook. It is designed to scale to a very large number of nodes and be robust to network and node failures. There is a scribe server running on every node in the system, configured to aggregate messages and send them to a central scribe server (or servers) in larger groups.

2. Logstash – Centralized log storage, indexing, and searching

Logstash is a tool for managing events and logs. You can use it to collect logs, parse them, and store them for later use. Logstash comes with a web interface for searching and drilling into all of your logs.

3. Octopussy – Perl/XML Logs Analyzer, Alerter & Reporter
Octopussy is a log analyzer tool. It analyzes logs, generates reports, and alerts the admin. It has LDAP support to maintain the user list, and it exports reports by email, FTP and SCP. Scheduled reports can be generated, and it uses RRDtool to generate graphs.

4. Awstats – Advanced web, streaming, ftp and mail server statistics
AWStats is a powerful tool that generates advanced web, streaming, FTP or mail server statistics graphically. It can analyze log files from all major server tools like Apache, WebStar, IIS and a lot of other web, proxy, WAP and streaming servers, mail servers and some FTP servers. This log analyzer works as a CGI or from the command line and shows you all the information your log contains, in a few graphical web pages.

5. nxlog – Multi platform Log management
nxlog is a modular, multi-threaded, high-performance log management solution with multi-platform support. In concept it is similar to syslog-ng or rsyslog but is not limited to unix/syslog only. It can collect logs from files in various formats and receive logs from the network remotely over UDP, TCP or TLS/SSL. It supports platform-specific sources such as the Windows Eventlog, Linux kernel logs, Android logs, local syslog, etc.

6. Graylog2 – Open Source Log Management
Graylog2 is an open source log management solution that stores your logs in ElasticSearch. It consists of a server written in Java that accepts your syslog messages via TCP, UDP or AMQP and stores them in the database. The second part is a web interface that allows you to manage the log messages from your web browser. Take a look at the screenshots or the latest release info page to get a feeling of what you can do with Graylog2.

7. Fluentd – Data collector, Log Everything in JSON
Fluentd is an event collector system. It is a generalized version of syslogd which handles JSON objects as its log messages. It collects logs from various data sources and writes them to files, databases or other types of storage.

8. Meniscus – The Python Event Logging Service

Meniscus is a Python based system for event collection, transit and processing in the large. Its primary use case is large-scale cloud logging, but it can be used in many other scenarios including usage reporting and API tracing. Its components include Collection, Transport, Storage, Event Processing & Enhancement, Complex Event Processing, and Analytics.

9. lucene-log4j – Log4j file rolling appender which indexes log with Lucene
lucene-log4j solves a recurrent problem that production support teams face whenever a live incident happens: filtering production log statements to match a session/transaction/user ID. It works by extending Log4j’s RollingFileAppender with Lucene indexing routines. Then, with a LuceneLogSearchServlet, you get access to your logs through a web front end.

10. Chainsaw – log viewer and analysis tool
Chainsaw is a companion application to Log4j written by members of the Log4j development community. Chainsaw can read log files formatted in Log4j’s XMLLayout, receive events from remote locations, read events from a DB, and it can even work with JDK 1.4 logging events.

11. Logsandra – log management using Cassandra
Logsandra is a log management application written in Python, using Cassandra as its back-end. It was written as a demo for Cassandra, but it is worth a look. It provides support for creating your own parser.

12. Clarity – Web interface for grep
Clarity is a Splunk-like web interface for your server log files. It supports searching (using grep) as well as tailing log files in real time. It is built on the event-based architecture of EventMachine and so allows real-time search of very large log files.

13. Webalizer – fast web server log file analysis
The Webalizer is a fast web server log file analysis program. It produces highly detailed, easily configurable usage reports in HTML format, for viewing with a standard web browser. It handles standard Common Logfile Format (CLF) server logs, several variations of the NCSA Combined logfile format, wu-ftpd/proftpd xferlog (FTP) format logs, Squid proxy server native format, and W3C Extended log formats.

14. Zenoss – Open Source IT Management
Zenoss Core is an open source IT monitoring product that delivers the functionality to effectively manage the configuration, health, and performance of networks, servers and applications through a single, integrated software package.

15. OtrosLogViewer – Log parser and Viewer
OtrosLogViewer can read log files formatted in Log4j (pattern and XMLLayout) and java.util.logging. The source of events can be a local or remote file (FTP, SFTP, Samba, HTTP) or a socket. It has many powerful features like filtering, marking, formatting, adding notes, etc. It can also format SOAP messages in logs.

16. Kafka – A high-throughput distributed messaging system
Kafka provides a publish-subscribe solution that can handle all activity stream data and processing on a consumer-scale web site. This kind of activity (page views, searches, and other user actions) is a key ingredient in many of the social features on the modern web. This data is typically handled by “logging” and ad hoc log aggregation solutions due to the throughput requirements. This kind of ad hoc solution is a viable way of providing logging data to Hadoop.

17. Kibana – Web Interface for Logstash and ElasticSearch
Kibana is a highly scalable interface for Logstash and ElasticSearch that allows you to efficiently search, graph, analyze and otherwise make sense of a mountain of logs. Kibana will load balance against your Elasticsearch cluster. Logstash’s daily rolling indices let you scale to huge datasets, while Kibana’s sequential querying gets you the most relevant data quickly, with more as it becomes available.

18. Pylogdb

pylogdb is a Python-powered, column-oriented database suitable for web log analysis.

19. Epylog – a Syslog parser
Epylog is a syslog parser which runs periodically, looks at your logs, processes some of the entries in order to present them in a more comprehensible format, and then mails you the output. It is written specifically for large network clusters where a lot of machines (around 50 and upwards) log to the same loghost using syslog or syslog-ng.

20. Indihiang – IIS and Apache log analyzing tool
The Indihiang Project is a web log analyzing tool. This tool analyzes IIS and Apache web logs and generates real-time reports. It has a web log viewer and analyzer and is capable of analyzing trends from the logs. The tool also integrates with Windows Explorer, so you can open a log file in Indihiang via the context menu.


Make your Eclipse Faster

Posted: February 11, 2013 by Narendra Shah in Java Box, Reserch And Development

Once we start working in Eclipse, after some time it usually becomes slow. To make it fast again, there is a way to clean up the history and indexes which Eclipse has created (if both are not important to you). This helped me speed up Eclipse and decrease RAM and HDD usage on my system.

Here are the 2 folders which should be cleaned up:

Index info

Delete the files under both folders. This has a big impact on Eclipse performance.

Note: Both folders are part of your Eclipse workspace, not of your project source or your Eclipse installation. You can view the workspace location under Window->Preferences->(General->Startup and Shutdown->Workspaces).
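As a sketch of the cleanup, assuming the standard .metadata layout: the index files usually live under .metadata/.plugins/org.eclipse.jdt.core and local history under .metadata/.plugins/org.eclipse.core.resources/.history. Double-check these paths against your own workspace before deleting; the indexes are rebuilt on demand, but local history is gone for good.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.stream.Stream;

public class CleanWorkspace {
    // Assumed default locations inside the workspace -- verify yours first.
    static void clean(Path workspace) throws IOException {
        // JDT search indexes: Eclipse rebuilds these automatically.
        deleteRecursively(workspace.resolve(".metadata/.plugins/org.eclipse.jdt.core"));
        // Local edit history: lost permanently once deleted.
        deleteRecursively(workspace.resolve(".metadata/.plugins/org.eclipse.core.resources/.history"));
    }

    static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return; // nothing to do
        }
        try (Stream<Path> s = Files.walk(root)) {
            // Delete children before parents.
            s.sorted(Comparator.reverseOrder()).forEach(p -> p.toFile().delete());
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical workspace path -- substitute your own.
        clean(Paths.get(System.getProperty("user.home"), "workspace"));
    }
}
```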

You can even follow the links below if you are running a slow machine.

Here are some good additional tips:

1. Add the following in eclipse.ini


2. Configure Xmx, Xms and PermSize according to your system. I have configured mine with the following in eclipse.ini


These are all working parameters.
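The original values did not survive formatting; purely as an illustration, a typical memory configuration in the vmargs section of eclipse.ini (for the PermGen-era JVMs current when this was written) might look like the following. The numbers are assumptions; tune them to your machine.

```ini
-vmargs
-Xms512m
-Xmx1024m
-XX:PermSize=256m
-XX:MaxPermSize=512m
```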

In a JDBC application, the first and foremost question is whether to use stored procedures or Java PreparedStatement. Researching this took me around 2 days, and I finally understood the concept.

In JDBC, whenever we use a PreparedStatement, it is compiled for that particular connection; once you close the connection and open a new one, the PreparedStatement is recompiled. So if you want to leverage a PreparedStatement, it should be used on a single connection. If instead you are using multiple connection objects, the PreparedStatement loses its performance benefit for that statement/query, because precompilation happens at each connection level. I hope this makes the use of PreparedStatement much clearer.
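To illustrate the reuse pattern, here is a minimal sketch. The connection URL, credentials and table are placeholders; the point is that the statement is prepared once and executed many times on the same connection.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class PreparedStatementReuse {
    // Placeholder URL and credentials -- substitute your own database.
    static final String URL = "jdbc:mysql://localhost:3306/mydb";

    public static void main(String[] args) {
        try (Connection con = DriverManager.getConnection(URL, "user", "pass");
             PreparedStatement ps = con.prepareStatement(
                     "SELECT name FROM users WHERE id = ?")) {
            // Compiled once for this connection; only the parameter changes per execution.
            for (int id = 1; id <= 100; id++) {
                ps.setInt(1, id);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getString("name"));
                    }
                }
            }
        } catch (SQLException e) {
            // Without a reachable database this simply reports the failure.
            System.out.println("Could not connect: " + e.getMessage());
        }
    }
}
```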

Using stored procedures is standard practice for joins or big queries, and it is definitely a good idea, but it restricts database portability in your application: once you decide to migrate the database to another RDBMS, you may need to rewrite the stored procedures. As for the performance benefit, stored procedures are faster because the database engine need not recompile the SQL statement again and again. This matters at the microsecond level, but if some query is performed frequently or used widely, we should convert it to a stored procedure.
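From the JDBC side, a stored procedure is invoked through CallableStatement. A minimal sketch, where the URL, credentials and the procedure name get_user_orders are all hypothetical:

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;

public class StoredProcCall {
    // Placeholder URL and credentials -- substitute your own database.
    static final String URL = "jdbc:mysql://localhost:3306/mydb";

    public static void main(String[] args) {
        try (Connection con = DriverManager.getConnection(URL, "user", "pass");
             // The execution plan for the procedure stays compiled in the database.
             CallableStatement cs = con.prepareCall("{call get_user_orders(?)}")) {
            cs.setInt(1, 42);
            try (ResultSet rs = cs.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        } catch (SQLException e) {
            // Without a reachable database this simply reports the failure.
            System.out.println("Could not connect: " + e.getMessage());
        }
    }
}
```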

Today I came to know how to set the default web application. I mean that when you install Tomcat on your system and open your-machine-ip:8080, the Tomcat application is displayed; configuring your own application on the first page is really easy.

It is actually loading the ROOT application which comes with your Tomcat for free.

Just add the following in your tomcat-dir/conf/server.xml.

Search for the Host tag and put the following line inside it (search for <Host).

<Context path="" docBase="iview" debug="0" reloadable="true" />

Now one more thing: newer Tomcat (above v5.x) preserves sessions between Tomcat restarts. To stop that, change tomcat-dir/conf/context.xml and un-comment the following tag.

<Manager pathname="" />

You can also change the default port for your Tomcat. Just change the port attribute value to your custom port number in the Connector tag in server.xml, and you are done.
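For example, a Connector in server.xml moved from the default 8080 to 9090 would look like the following (the other attribute values here are just the common defaults, shown for context):

```xml
<Connector port="9090" protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443" />
```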

Apache Linux Tomcat configuration

Posted: April 30, 2009 by Narendra Shah in Java Box

Sharing here only links and not real content. Sometimes I have problems configuring Tomcat and Apache. Each time I need to google it, but today I thought I would put the links here.

HOWTO : Installing Web Services with

Linux / Tomcat / Apache / Struts / Postgresql / OpenSSL / JDBC / JNDI / DBCP

Apache 2.x + Tomcat 4.x + Load Balancing Tutor

When we write JavaScript, we miss many things. So from now on, if you want to check your JavaScript code, test it at the following URL.
You just need to copy your JavaScript function and click on JSLint. It shows possible errors.

More info about it is in the link given below.

About Robot class and robots.txt

Posted: November 13, 2008 by Narendra Shah in Java Box

JDK 1.3.1 includes a class named Robot, which does very important work: you can automate a system that clicks and types characters, everything under the control of a Java class. If you try it, you will enjoy it. The biggest benefit I see is that you can automate testing-like activity, or automate some process which is done manually.
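A minimal sketch of what java.awt.Robot can do. It needs a real display (the code skips itself on headless machines), and the keys and coordinates here are arbitrary examples:

```java
import java.awt.AWTException;
import java.awt.GraphicsEnvironment;
import java.awt.Robot;
import java.awt.event.KeyEvent;

public class RobotDemo {
    public static void main(String[] args) throws AWTException {
        if (GraphicsEnvironment.isHeadless()) {
            System.out.println("Robot needs a display; skipping.");
            return;
        }
        Robot robot = new Robot();
        robot.delay(1000);                 // give yourself a second to focus a window
        // Types "hi" into whatever window has keyboard focus.
        robot.keyPress(KeyEvent.VK_H);
        robot.keyRelease(KeyEvent.VK_H);
        robot.keyPress(KeyEvent.VK_I);
        robot.keyRelease(KeyEvent.VK_I);
        robot.mouseMove(200, 200);         // moves the real mouse pointer
    }
}
```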

Good article about it is …

About robots.txt

The robot exclusion standard, also known as the Robots Exclusion Protocol or robots.txt protocol, is a convention to prevent cooperating web spiders and other web robots from accessing all or part of a website which is otherwise publicly viewable. Robots are often used by search engines to categorize and archive web sites, or by webmasters to proofread source code. The standard complements Sitemaps, a robot inclusion standard for websites.
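As a sketch of how a cooperating robot might honour these rules, here is a tiny, simplified Disallow check. It only looks at the `User-agent: *` section and treats Disallow values as path prefixes; real parsers also handle Allow lines, wildcards and per-agent sections.

```java
import java.util.ArrayList;
import java.util.List;

public class RobotsTxt {
    // Collect Disallow prefixes from the "User-agent: *" section only.
    static List<String> parseDisallows(String robotsTxt) {
        List<String> disallows = new ArrayList<>();
        boolean appliesToUs = false;
        for (String line : robotsTxt.split("\n")) {
            line = line.trim();
            if (line.toLowerCase().startsWith("user-agent:")) {
                appliesToUs = line.substring(11).trim().equals("*");
            } else if (appliesToUs && line.toLowerCase().startsWith("disallow:")) {
                String path = line.substring(9).trim();
                if (!path.isEmpty()) {
                    disallows.add(path); // empty Disallow means "allow everything"
                }
            }
        }
        return disallows;
    }

    // A path is allowed unless it starts with a disallowed prefix.
    static boolean isAllowed(String path, List<String> disallows) {
        for (String prefix : disallows) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String sample = "User-agent: *\nDisallow: /private/\nDisallow: /tmp/\n";
        List<String> rules = parseDisallows(sample);
        System.out.println(isAllowed("/private/data.html", rules)); // false
        System.out.println(isAllowed("/index.html", rules));        // true
    }
}
```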