Business Java Performance Tuning Oreilly Pdf


Saturday, April 13, 2019

Java Performance Tuning provides all the details you need to know to "performance tune" any type of Java program and make Java code run significantly faster.

Language: English, Spanish, French
Genre: Business & Career
Published (Last): 11.02.2016
ePub File Size: 27.74 MB
PDF File Size: 17.63 MB
Distribution: Free* [*Registration Required]
Uploaded by: WINTER

Java™ Performance Tuning, 2nd Edition. By Jack Shirazi. Publisher: O'Reilly. Pub Date: January. Java, the cover image, and related trade dress are trademarks of O'Reilly Media, Inc.

And I am sure that this book will be very useful when I have to analyze the next performance issue. I recommend it to every experienced Java developer who would like to learn more about performance optimization. But let's get into more detail.

About the author

Scott works as an architect at Oracle, where he works on the performance of its middleware software.

About the book

The first and current edition of the book is divided into 12 chapters. The first three chapters provide a lot of introductory and methodological content. Chapter 1 gives a short introduction to the book. In chapters 2 and 3, Scott explains how to do performance tests and recommends several tools for them. This is followed by a good explanation of the JIT compiler in chapter 4. After reading this chapter, you really know how the JIT compiler decides which code to compile and which to deoptimize. You also learn what you can do to tune it.

Repeat from Step 1. This procedure gets your application tuned the quickest. The advantage of choosing the "quickest to fix" of the top few bottlenecks rather than the absolute topmost problem is that once a bottleneck has been eliminated, the characteristics of the application change, and the topmost bottleneck may not need to be addressed any longer.

However, in distributed applications I advise you to target the topmost bottleneck. The characteristics of distributed applications are such that the main bottleneck is almost always the best to fix and, once fixed, the next main bottleneck is usually in a completely different component of the system.

Although this strategy is simple and actually quite obvious, I nevertheless find that I have to repeat it again and again: once programmers get the bit between their teeth, they just love to apply themselves to the interesting parts of the problems.

After all, who wants to unroll loop after boring loop when there's a nice juicy caching technique you're eager to apply? You should always treat the actual identification of the cause of the performance bottleneck as a science, not an art.

The general procedure is straightforward:

1. Measure the performance by using profilers and benchmark suites and by instrumenting code.
2. Identify the locations of any bottlenecks.
3. Think of a hypothesis for the cause of the bottleneck.
4. Consider any factors that may refute your hypothesis.
5. Create a test to isolate the factor identified by the hypothesis.
6. Test the hypothesis.
7. Alter the application to reduce the bottleneck.
8. Test that the alteration improves performance, and measure the improvement (include regression-testing the affected code).
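The measurement and verification steps of this procedure can be sketched in a few lines of Java. This is only a minimal illustration under stated assumptions: the class `TimingCheck`, the helper `bestTimeNanos`, and the two loop variants are hypothetical names invented for the example, and a raw `System.nanoTime()` loop ignores JIT warm-up and other effects that a real benchmark suite would control.

```java
import java.util.function.LongSupplier;

final class TimingCheck {
    // Written to on every run so the JIT cannot discard the measured work.
    static volatile long sink;

    // Hypothetical helper: runs the task `runs` times and returns the
    // smallest elapsed time in nanoseconds.
    static long bestTimeNanos(LongSupplier task, int runs) {
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            sink = task.getAsLong();
            long elapsed = System.nanoTime() - start;
            best = Math.min(best, elapsed);
        }
        return best;
    }

    static long square(long x) {
        return x * x;
    }

    // "Before" version: the loop makes a method call on every iteration.
    static long sumWithCalls(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += square(i);
        }
        return sum;
    }

    // "After" version: the same computation with the call removed.
    static long sumInlined(int n) {
        long sum = 0;
        for (long i = 0; i < n; i++) {
            sum += i * i;
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        // Regression check: the alteration must not change the result.
        if (sumWithCalls(n) != sumInlined(n)) {
            throw new AssertionError("optimized loop changed the result");
        }
        long before = bestTimeNanos(() -> sumWithCalls(n), 10);
        long after = bestTimeNanos(() -> sumInlined(n), 10);
        System.out.println("before: " + before + " ns, after: " + after + " ns");
    }
}
```

On a modern JIT the two variants may well time the same, which is exactly the kind of result that measuring, rather than assuming, is meant to reveal.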

Here's the procedure for a particular example: 1. You run the application through your standard profiler (measurement). Looking at the code, you find a complex loop and guess this is the problem (hypothesis). You see that it is not iterating that many times, so possibly the bottleneck could be outside the loop (confounding factor). You could vary the loop iteration as a test to see if that identifies the loop as the bottleneck.


However, you instead try to optimize the loop by reducing the number of method calls it makes: this provides a test to identify the loop as the bottleneck and at the same time provides a possible solution. In doing this, you are combining two steps, Steps 5 and 7. Although this is frequently the way tuning actually goes, be aware that this can make the tuning process longer: if there is no speedup, it may be because your optimization did not actually make things faster, in which case you have neither confirmed nor eliminated the loop as the cause of the bottleneck.

This method may still be a candidate for further optimization, but nevertheless it's confirmed as the bottleneck and your change has improved performance. Already done, combined with Step 5. Already done, combined with Step 6.

The user of an application sees changes as part of the performance. A browser that gives a running countdown of the amount left to be downloaded from a server is seen to be faster than one that just sits there, apparently hung, until all the data is downloaded.

People expect to see something happening, and a good rule of thumb is that if an application is unresponsive for more than three seconds, it is seen as slow.

Some Human Computer Interface authorities put the user patience limit at just two seconds; an IBM study from the early '70s suggested people's attention began to wander after waiting for more than just one second. A few long response times make a bigger impression on the memory than many shorter ones.

With a typical exponential distribution, the 90th percentile value is 2. Consequently, as long as you reduce the variation in response times so that the 90th percentile value is smaller than before, you can actually increase the average response time, and the user will still perceive the application as faster.
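As a worked check on the relationship between the average and the 90th percentile: for an exponential distribution with mean m, the p-th quantile is -m × ln(1 - p), so the 90th percentile is m × ln(10), roughly 2.3 times the mean. The sketch below simply evaluates that formula; the class and method names are hypothetical, and real response times are only approximately exponential.

```java
final class ExponentialPercentile {
    // Quantile of an exponential distribution with the given mean:
    // solving 1 - exp(-t / mean) = p for t gives t = -mean * ln(1 - p).
    static double quantile(double mean, double p) {
        if (mean <= 0 || p < 0 || p >= 1) {
            throw new IllegalArgumentException("need mean > 0 and 0 <= p < 1");
        }
        return -mean * Math.log(1.0 - p);
    }

    public static void main(String[] args) {
        double mean = 1.0; // assumed average response time in seconds
        System.out.println("90th percentile: " + quantile(mean, 0.90) + " s");
    }
}
```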

For this reason, you may want to target variation in response times as a primary goal. Unfortunately, this is one of the more complex targets in performance tuning: it can be difficult to determine exactly why response times are varying.

If the interface provides feedback and allows the user to carry on other tasks or abort and start another function (preferably both), the user sees this as a responsive interface and doesn't consider the application as slow as he might otherwise. If you give users an expectancy of how long a particular task might take and why, they often accept this and adjust their expectations. Modern web browsers provide an excellent example of this strategy in practice.

People realize that the browser is limited by the bandwidth of their connection to the Internet and that downloading cannot happen faster than a given speed. Good browsers always try to show the parts they have already received so that the user is not blocked, and they also allow the user to terminate downloading or go off to another page at any time, even while a page is partly downloaded.

Generally, it is not the browser that is seen to be slow, but rather the Internet or the server site. In fact, browser creators have made a number of tradeoffs so that their browsers appear to run faster in a slow environment. I have measured browser display of identical pages under identical conditions and found browsers that are actually faster at full page display but seem slower because they do not display partial pages, download embedded links concurrently, and so on.

Modern web browsers provide a good example of how to manage user expectations and perceptions of performance. However, one area in which some web browsers have misjudged user expectation is when they give users a momentary false expectation that operations have finished when in fact another is to start immediately. This false expectation is perceived as slow performance. This frustrates users who initially expected the completion time from the first download report and had geared themselves up to do something, only to have to wait again (often repeatedly).

A better practice would be to report on how many elements need to be downloaded as well as the current download status, giving the user a clearer expectation of the full download time. Where there are varying possibilities for performance tradeoffs e. It is better to provide the option to choose between faster performance and better functionality.

When users have made the choice themselves, they are often more willing to put up with actions taking longer in return for better functionality. When users do not have this control, their response is usually less tolerant.

This strategy also allows those users who have strong performance requirements to be provided for at their own cost. But it is always important to provide a reasonable default in the absence of any choice from the user. Where there are many different parameters, consider providing various levels of user-controlled tuning parameters, e.

This must, of course, be well documented to be really useful.

This time can be used to anticipate what the user wants to do (using a background low-priority thread), so that precalculated results are ready to assist the user immediately. This makes an application appear blazingly fast. Similarly, ensuring that your application remains responsive to the user, even while it is executing some other function, makes it seem fast and responsive.
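The idea of precalculating results on a low-priority background thread might look roughly like the following. This is a sketch under assumptions, not the book's code: `Precompute`, `anticipate`, and `expensiveLookup` are hypothetical names, and the 50 ms sleep merely stands in for real work.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class Precompute {
    // One background worker running at minimum priority, marked as a daemon
    // so it never keeps the application alive on its own.
    static final ExecutorService BACKGROUND = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "precompute");
        t.setPriority(Thread.MIN_PRIORITY);
        t.setDaemon(true);
        return t;
    });

    // Hypothetical expensive operation; the sleep simulates slow work.
    static String expensiveLookup(String key) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return key.toUpperCase();
    }

    // Starts computing the result the user is likely to ask for next.
    static CompletableFuture<String> anticipate(String likelyKey) {
        return CompletableFuture.supplyAsync(() -> expensiveLookup(likelyKey), BACKGROUND);
    }

    public static void main(String[] args) {
        CompletableFuture<String> ready = anticipate("report");
        // ... the user is still reading or typing here ...
        // By the time the request arrives, the result is usually available.
        System.out.println(ready.join());
    }
}
```

Thread priorities are only a hint to the scheduler, so how much the foreground work actually benefits depends on the platform.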

For example, I always find that when starting up an application, applications that draw themselves on screen quickly and respond to repaint requests even while still initializing (you can test this by putting the window in the background and then bringing it to the foreground) give the impression of being much faster than applications that seem to be chugging away unresponsively.

Starting different word-processing applications with a large file to open can be instructive, especially if the file is on the network or a slow removable disk. Some act very nicely, responding almost immediately while the file is still loading; others just hang unresponsively with windows only partially refreshed until the file is loaded; others don't even fully paint themselves until the file has finished loading.

This illustrates what can happen if you do not use threads appropriately. In Java, the key to making an application responsive is multithreading. Use threads to ensure that any particular service is available and unblocked when needed. Of course, this can be difficult to program correctly and manage. Handling interthread communication with maximal responsiveness and minimal bugs is a complex task, but it does tend to make for a very snappily built application.

For example, a request to list all the details on all the files in a particular large directory may not fit on one display screen. The usual way to display this is to show as much as will fit on a single screen and indicate that there are more items available with a scrollbar.

Other applications or other information may use a "more" button or have other ways of indicating how to display or move on to the extra information. In these cases, you initially need to display only a partial result of the activity.

This tactic can work very much in your favor. For activities that take too long and for which some of the results can be returned more quickly than others, it is certainly possible to show just the first set of results while continuing to compile more results in the background. This gives the user an apparently much quicker response than if you were to wait for all the results to be available before displaying them.
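One way to show the first set of results while the rest are still being compiled is to have a background producer feed a queue and let the display code take only one screenful. The sketch below assumes hypothetical names (`PartialResults`, `startProducing`, `firstPage`) and uses generated strings in place of real results.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class PartialResults {
    // Starts a daemon thread that produces `total` results one at a time
    // and returns the queue they arrive on.
    static BlockingQueue<String> startProducing(int total) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        Thread producer = new Thread(() -> {
            for (int i = 1; i <= total; i++) {
                queue.add("item " + i); // stand-in for a slow lookup
            }
        }, "result-producer");
        producer.setDaemon(true);
        producer.start();
        return queue;
    }

    // Waits only until one screenful is available, not for the full result set.
    // Assumes the producer will deliver at least `pageSize` items.
    static List<String> firstPage(BlockingQueue<String> queue, int pageSize) {
        List<String> page = new ArrayList<>(pageSize);
        try {
            for (int i = 0; i < pageSize; i++) {
                page.add(queue.take());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // return what was collected so far
        }
        return page;
    }

    public static void main(String[] args) {
        BlockingQueue<String> results = startProducing(1000);
        System.out.println(firstPage(results, 20));
    }
}
```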

Review “Java Performance: The Definitive Guide” by Scott Oaks

This situation is often the case for distributed applications. A well-known example is again! The general case is when you have a long activity that can provide results in a stream so that the results can be accessed a few at a time. For distributed applications, sending all the data is often what takes a long time; in this case, you can build streaming into the application by sending one screenful of data at a time.

Also, bear in mind that when there is a really large amount of data to display, the user often views only some of it and aborts, so be sure to build in the ability to stop the stream and restore its resources at any time. Caching is an optimization technique I return to in several different sections of this book when appropriate to the problem under discussion.

Some caches cannot be tuned at all; others are tuneable at the operating-system level or in Java. Where it is possible for a developer to take advantage of or tune a particular cache, I provide suggestions and approaches that cover the caching technique appropriate to that area of the application.

In cases where caches are not directly tuneable, it is still worth knowing the effect of using the cache in different ways and how this can affect performance. For example, disk hardware caches almost always apply a readahead algorithm: the cache is filled with the next block of data after the one just read. This means that reading backward through a file in chunks is not as fast as reading forward through the file. Caches are effective because it is expensive to move data from one place to another or to calculate results.

If you need to do this more than once to the same piece of data, it is best to hang onto it the first time and refer to the local copy in the future. This applies, for example, to remote access of files such as browser downloads.

The browser caches the downloaded file locally on disk to ensure that a subsequent access does not have to reach across the network to reread the file, thus making it much quicker to access a second time. It also applies, in a different way, to reading bytes from the disk. Here, the cost of reading one byte for operating systems is the same as reading a page (usually 4 or 8 KB), as data is read into memory a page at a time by the operating system.

If you are going to read more than one byte from a particular disk area, it is better to read in a whole page (or all the data if it fits on one page) and access bytes through your local copy of the data.
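Reading a page at a time and then working on the local copy can be sketched as follows. The 8 KB buffer size is an assumption taken from the "4 or 8 KB" figure above, and `PageReads` and `countByte` are hypothetical names chosen for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;

final class PageReads {
    static final int PAGE_SIZE = 8 * 1024; // assumed page size

    // Counts occurrences of `target`, pulling data in page-sized chunks and
    // examining each byte in the local buffer instead of issuing one read per byte.
    static long countByte(InputStream in, byte target) {
        long count = 0;
        byte[] page = new byte[PAGE_SIZE];
        try {
            int n;
            while ((n = in.read(page)) != -1) {
                for (int i = 0; i < n; i++) {
                    if (page[i] == target) {
                        count++;
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java PageReads.java <file>
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println("newlines: " + countByte(in, (byte) '\n'));
        }
    }
}
```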

General aspects of caching are covered in more detail elsewhere in the book. Caching is an important performance-tuning technique that trades space for time, and it should be used whenever extra memory space is available to the application.
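The space-for-time trade can be illustrated with a very small in-memory cache. This is a sketch only: `SimpleCache` is a hypothetical class, it never evicts anything, and a production cache would need a size bound and an invalidation policy.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class SimpleCache<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();
    private final Function<K, V> loader;
    private final AtomicInteger loads = new AtomicInteger();

    // `loader` is the expensive operation whose results are worth keeping.
    SimpleCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Returns the cached value, invoking the loader only on the first request for a key.
    V get(K key) {
        return store.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    // Number of times the expensive loader actually ran.
    int loadCount() {
        return loads.get();
    }

    public static void main(String[] args) {
        SimpleCache<String, Integer> lengths = new SimpleCache<>(String::length);
        lengths.get("hello");
        lengths.get("hello"); // served from the local copy
        System.out.println("loader calls: " + lengths.loadCount());
    }
}
```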

Before you start tuning, it is crucial to identify the target response times for as much of the system as possible. At the outset, you should agree on these targets with your users (directly if you have access to them, or otherwise through representative user profiles, market information, etc.).

The performance should be specified for as many aspects of the system as possible, including: Multiuser response times depending on the number of users if applicable Systemwide throughput e.


Otherwise, you will not know where to target your effort, how far you need to go, whether particular performance targets are achievable at all, and how much tuning effort those targets may require. But most importantly, without agreed targets, whatever you achieve will tend to become the starting point. The following scenario is not unusual: a manager sees horrendous performance, perhaps a function that was expected to be quick, but takes seconds.

His immediate response is, "Good grief, I expected this to take no more than 10 seconds." The manager's response is now, "Ah, that's more reasonable, but of course I actually meant to specify 3 seconds; I just never believed you could get down so far after seeing it take seconds." Now you can start tuning. Agreeing on targets before tuning makes everything clear to everyone.

These are precise specifications stating what part of the code needs to run in what amount of time. Without first specifying benchmarks, your tuning effort is driven only by the target, "It's gotta run faster," which is a recipe for a wasted return. You must ask, "How much faster and in which parts, and for how much effort?" You must specify target times for each benchmark. You should specify ranges: for example, best times, acceptable times, etc.

These times are often specified in frequencies of achieving the targets. Note that the earlier section on user perceptions indicates that the user will see this function as having a 5-second response time (the 90th percentile value) if you achieve the specified ranges.
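Targets expressed as frequencies can be checked mechanically against a list of measured times. The sketch below uses the nearest-rank definition of a percentile; `ResponseTargets`, the sample times, and the 5000 ms limit are illustrative assumptions rather than values from the text.

```java
import java.util.Arrays;

final class ResponseTargets {
    // Nearest-rank percentile: the smallest measured value such that at least
    // `pct` percent of the measurements are less than or equal to it.
    static long percentile(long[] timesMillis, int pct) {
        if (timesMillis.length == 0 || pct <= 0 || pct > 100) {
            throw new IllegalArgumentException("need data and 0 < pct <= 100");
        }
        long[] sorted = timesMillis.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct * sorted.length / 100.0);
        return sorted[rank - 1];
    }

    // True if at least `pct` percent of the runs finished within `limitMillis`.
    static boolean meetsTarget(long[] timesMillis, int pct, long limitMillis) {
        return percentile(timesMillis, pct) <= limitMillis;
    }

    public static void main(String[] args) {
        long[] times = {100, 200, 300, 400, 500, 600, 700, 800, 900, 5000};
        System.out.println("90th percentile: " + percentile(times, 90) + " ms");
        System.out.println("90% within 5000 ms: " + meetsTarget(times, 90, 5000));
    }
}
```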

You should also have a range of benchmarks that reflect the contributions of different components of the application. If possible, it is better to start with simple tests so that the system can be understood at its basic levels, and then work up from these tests. In a complex application, this helps to determine the relative costs of subsystems and which components are most in need of performance-tuning. The following point is critical: Without clear performance objectives, tuning will never be completed.

This is a common syndrome on single or small group projects, where code keeps being tweaked as better implementations or cleverer code is thought up. Your general benchmark suite should be based on real functions used in the end application, but at the same time should not rely on user input, as this can make measurements difficult.

Any variability in input times or any other part of the application should either be eliminated from the benchmarks or precisely identified and specified within the performance targets. There may be variability, but it must be controlled and reproducible. However, because their focus tends to be on robustness testing, many tools interfere with the application's performance, and you may not find a tool you can use adequately or cost-effectively.

If you cannot find an acceptable tool, the alternative is to build your own harness. In addition, some Java profilers are listed elsewhere in the book. Your benchmark harness can be as simple as a class that sets some values and then starts the main method of your application.
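A harness of the simplest kind described here might look like the sketch below. Everything in it is an assumption made for illustration: `DemoApp` is a hypothetical stand-in for the real application, and the `demo.iterations` system property is an invented configuration value.

```java
import java.util.function.Consumer;

final class BenchmarkHarness {
    // Hypothetical stand-in for the application under test.
    static final class DemoApp {
        static long lastResult;

        public static void main(String[] args) {
            int n = Integer.getInteger("demo.iterations", 1000);
            long sum = 0;
            for (int i = 0; i < n; i++) {
                sum += i;
            }
            lastResult = sum;
        }
    }

    // Starts the given main method and returns the elapsed wall-clock time in nanoseconds.
    static long run(Consumer<String[]> appMain, String[] appArgs) {
        long start = System.nanoTime();
        appMain.accept(appArgs);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // "Sets some values": fix the configuration so every run is comparable.
        System.setProperty("demo.iterations", "5000000");
        long elapsed = run(DemoApp::main, new String[0]);
        System.out.println("elapsed: " + (elapsed / 1_000_000.0) + " ms");
    }
}
```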


A slightly more sophisticated harness might turn on logging and timestamp all output for later analysis. GUI-run applications need a more complex harness and require either an alternative way to execute the graphical functionality without going through the GUI (which may depend on whether your design can support this), or a screen event capture and playback tool (several such tools exist[3]). In any case, the most important requirement is that your harness correctly reproduce user activity and data input and output.

Normally, whatever regression-testing apparatus you have (and presumably are already using) can be adapted to form a benchmark harness.

[3] The java.awt.Robot class provides for generating native system-input events, primarily to support automated testing of Java GUIs.

The benchmark harness should not test the quality or robustness of the system. Operations should be normal: startup, shutdown, and uninterrupted functionality.

The harness should support the different configurations your application operates under, and any randomized inputs should be controlled, but note that the random sequence used in tests should be reproducible. You should use a realistic amount of randomized data and input. It is helpful if the benchmark harness includes support for logging statistics and easily allows new tests to be added. The harness should be able to reproduce and simulate all user input, including GUI input, and should test the system across all scales of intended use up to the maximum numbers of users, objects, throughputs, etc.
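Randomized input stays reproducible if the harness derives it from a fixed seed. A minimal sketch, assuming a hypothetical `ReproducibleInput` class that generates request sizes:

```java
import java.util.Arrays;
import java.util.Random;

final class ReproducibleInput {
    // Generates `count` pseudo-random sizes in the range 1..maxSize.
    // The same seed always yields the same sequence, so runs can be compared.
    static int[] randomSizes(long seed, int count, int maxSize) {
        Random rng = new Random(seed);
        int[] sizes = new int[count];
        for (int i = 0; i < count; i++) {
            sizes[i] = 1 + rng.nextInt(maxSize);
        }
        return sizes;
    }

    public static void main(String[] args) {
        int[] first = randomSizes(42L, 5, 1000);
        int[] second = randomSizes(42L, 5, 1000);
        System.out.println(Arrays.toString(first));
        System.out.println("identical on rerun: " + Arrays.equals(first, second));
    }
}
```

Recording the seed alongside the benchmark results makes it possible to replay exactly the same input later.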

You should also validate your benchmarks, checking some of the values against actual clock time to ensure that no systematic or random bias has crept into the benchmark harness.

For the multiuser case, the benchmark harness must be able to simulate multiple users working, including variations in user access and execution patterns. Without this support for variations in activity, the multiuser tests inevitably miss many bottlenecks encountered in actual deployment and, conversely, do encounter artificial bottlenecks that are never encountered in deployment, wasting time and resources.

It is critical in multiuser and distributed applications that the benchmark harness correctly reproduce user-activity variations, delays, and data flows. The benchmarks should be run multiple times, and the full list of results retained, not just the average and deviation or the ranged percentages. Also note the time of day that benchmarks are being run and any special conditions that apply, e. Sometimes the variation can give you useful information.

It is essential that you always run an initial benchmark to precisely determine the initial times. This is important because, together with your targets, the initial benchmarks specify how far you need to go and highlight how much you have achieved when you finish tuning. It is more important to run all benchmarks under the same conditions than to achieve the end-user environment for those benchmarks, though you should try to target the expected environment.

It is possible to switch environments by running all benchmarks on an identical implementation of the application in two environments, thus rebasing your measurements.

But this can be problematic: it requires detailed analysis because different environments usually have different relative performance between functions (thus your initial benchmarks could be skewed compared with the current measurements).

Each set of changes (and preferably each individual change) should be followed by a run of benchmarks to precisely identify improvements or degradations in the performance across all functions. A particular optimization may improve the performance of some functions while at the same time degrading the performance of others, and obviously you need to know this.

Each set of changes should be driven by identifying exactly which bottleneck is to be improved and how much of a speedup is expected. Rigorously using this methodology provides a precise target for your effort.


You need to verify that any particular change does improve performance. It is tempting to change something small that you are sure will give an "obvious" improvement, without bothering to measure the performance change for that modification because "it's too much trouble to keep running tests".

But you could easily be wrong. Jon Bentley once discovered that eliminating code from some simple loops can actually slow them down (Dr. Dobb's Journal, May).


It is not a programming recipe book by any stretch of the imagination. Naturally, "too much" is different depending on the application, and the users of the application usually make this choice.

