Question: Assignment 2, Benchmarking the memory subsystem

Assignment 2: Benchmarking the memory subsystem
Deadline: Thursday February 18, 2016

1 Analysis

How is the memory subsystem structured on your machine? On a Linux system, likwid-topology (part of the LIKWID suite from the RRZE at the University of Erlangen-Nuremberg) is a good tool for finding out how the memory system is organized.

- How much DRAM is installed? What is the memory speed? What is the memory technology?
- How much bandwidth can your processor draw from the bus? You will need to find the width and the clock speed of the memory bus. Is it full duplex or half duplex? (For Intel processors, ARK often lists this information.)
- What is the highest level of cache? How big is it? How is it shared across cores? Answer the same questions for every level of cache.

2 Bandwidth

The purpose of this section is to measure the maximum memory bandwidth of the different components of the system. The easiest way to make sure you are measuring memory rather than computation is to perform as few arithmetic operations as possible per byte of data.

For each level of the memory hierarchy, measure read, write, and read/write bandwidth. The easiest way to do this is to have each core perform a measurable number of memory transfers on a piece of data of a particular size. Plot each bandwidth as a function of the size of the data each core works on.

- To measure read bandwidth, the easiest test is often to compute the sum of an array.
- To measure write bandwidth, the easiest test is often to set a memory region to zero.
- To measure read/write bandwidth, the easiest test is often to copy one array into another.

The measurement itself can be an issue because of the fill-in (what happens at the beginning) and the flushout (what happens at the end). A reasonable approach is to loop over your main operation multiple times and time only the middle iterations, so that the measurements are taken while all the cores are busy.

3 Latency

The best way to measure memory latency is to perform memory operations that are not easily predicted. Linked lists are king in that context. Write an element-less singly linked list. (To be clear, it is simply an array of integers, next, whose entries index back into the same array, so that you traverse it with current = next[current];.)

For different sizes of the list, measure the time it takes to follow a large number of links and report the time per link followed. You will need to arrange the list so that it exposes the effect you want to measure:

- the instruction cost of the traversal, by setting up the list so that the traversal is core bound;
- the latency, by shuffling the order of the list, being careful not to build a short cycle;
- the TLB latency cost, by ensuring that every link jumps to a different memory page in a hard-to-predict fashion.

4 Help

4.1 Make sure bandwidth is memory bound

Look at the generated assembly to ensure that it is mostly memory I/O. You can estimate an acceptable ratio of arithmetic to I/O instructions from the machine's flops-to-bandwidth ratio.

4.2 Alignment

Be wary of alignment. Look at the difference between _mm256_load_ps (which requires a 32-byte-aligned address) and _mm256_loadu_ps (which accepts an unaligned address).

4.3 Other interesting things

_mm256_stream_load_si256 performs a non-temporal read (a hint that the data should not be cached).
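The linked-list traversal from section 3 can be sketched in C as follows. This is only an illustrative sketch, not the assignment's reference solution: the function names (build_sequential, build_single_cycle, chase, ns_per_link) and the xorshift generator are assumptions introduced here, and timing relies on the POSIX clock_gettime call. The shuffled layout uses Sattolo's algorithm, a Fisher-Yates variant that always produces one cycle covering the whole array, which avoids the short-cycle problem mentioned above. The TLB case is not covered.

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Small xorshift64 generator (Marsaglia); the state must stay nonzero. */
static uint64_t rng_state = 88172645463325252ULL;

static uint64_t rng_next(void)
{
    rng_state ^= rng_state << 13;
    rng_state ^= rng_state >> 7;
    rng_state ^= rng_state << 17;
    return rng_state;
}

/* Core-bound layout: element i points to i + 1, wrapping at the end.
 * The access pattern is trivially predictable, so prefetching hides
 * most of the memory latency. */
void build_sequential(uint32_t *next, size_t n)
{
    assert(n >= 1);
    for (size_t i = 0; i < n; i++)
        next[i] = (uint32_t)((i + 1) % n);
}

/* Latency-bound layout: a random permutation made of one single cycle.
 * Sattolo's algorithm draws j strictly below i, which guarantees that
 * following next[] visits all n entries before returning to the start. */
void build_single_cycle(uint32_t *next, size_t n)
{
    assert(n >= 1);
    for (size_t i = 0; i < n; i++)
        next[i] = (uint32_t)i;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)(rng_next() % i);   /* 0 <= j < i */
        uint32_t tmp = next[i];
        next[i] = next[j];
        next[j] = tmp;
    }
}

/* Follow `steps` links starting at `start`. Each load depends on the
 * previous one, so the loads cannot overlap. The final index is returned
 * so that the compiler cannot discard the loop. */
uint32_t chase(const uint32_t *next, uint32_t start, size_t steps)
{
    uint32_t current = start;
    for (size_t s = 0; s < steps; s++)
        current = next[current];
    return current;
}

/* Time `steps` link traversals and return the average in nanoseconds. */
double ns_per_link(const uint32_t *next, size_t steps)
{
    struct timespec t0, t1;
    assert(steps > 0);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    volatile uint32_t sink = chase(next, 0, steps);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    (void)sink;
    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / (double)steps;
}
```

One plausible way to use it: for a range of array sizes, fill the array with build_sequential or build_single_cycle, run chase once untimed to warm up, then call ns_per_link with a step count much larger than the array length and plot the result against the array size in bytes. Compile with optimization enabled and check the assembly, as section 4.1 suggests.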

