
AMD Introduces Radeon Instinct Machine Intelligence Accelerators (hothardware.com)

Reader MojoKid writes: AMD is announcing a new series of Radeon-branded products today, targeted at machine intelligence and deep learning enterprise applications, called Radeon Instinct. As its name suggests, the new Radeon Instinct line comprises GPU-based solutions for deep learning inference and training. The new GPUs are complemented by a free, open-source library and framework for GPU accelerators, dubbed MIOpen. MIOpen is architected for high-performance machine intelligence applications and is optimized for the deep learning frameworks in AMD's ROCm software suite. The first products in the lineup are the Radeon Instinct MI6, the MI8, and the MI25. The 150W Radeon Instinct MI6 accelerator is powered by a Polaris-based GPU, packs 16GB of memory (224GB/s peak bandwidth), and will offer up to 5.7 TFLOPS of peak FP16 performance. Next up in the stack is the Fiji-based Radeon Instinct MI8. Like the Radeon R9 Nano, the Radeon Instinct MI8 features 4GB of High-Bandwidth Memory (HBM) with peak bandwidth of 512GB/s. The MI8 will offer up to 8.2 TFLOPS of peak FP16 compute performance, with a board power that typically falls below 175W. The Radeon Instinct MI25 accelerator will leverage AMD's next-generation Vega GPU architecture and has a board power of approximately 300W. All of the Radeon Instinct accelerators are passively cooled, but when installed in a server chassis you can bet there will be plenty of airflow. Like the recently released Radeon Pro WX series of professional graphics cards for workstations, Radeon Instinct accelerators will be built by AMD. All of the Radeon Instinct cards will also support AMD MultiGPU (MxGPU) hardware virtualization technology.
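As a quick sanity check, the peak-FP16 and board-power figures quoted in the summary can be turned into rough perf-per-watt numbers (a sketch only; the MI25's throughput figure isn't disclosed, so it is omitted):

```python
# Peak FP16 throughput (TFLOPS) and board power (W), taken from the summary above.
# The MI25's TFLOPS figure was not announced, so it is left out.
cards = {"MI6": (5.7, 150), "MI8": (8.2, 175)}

for name, (tflops, watts) in cards.items():
    # 1 TFLOPS = 1000 GFLOPS, so TFLOPS/W * 1000 gives GFLOPS per watt.
    print(f"{name}: {tflops / watts * 1000:.1f} GFLOPS/W (FP16)")
```

By this crude measure the MI8 (about 47 GFLOPS/W) edges out the MI6 (about 38 GFLOPS/W), though peak numbers say little about sustained workloads.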
  • Every time I see "16 GB of memory" on a GPU card, I have to ask the same question... Is all 16GB addressable? I've never been 'not' disappointed before.
    • Every time I see "16 GB of memory" on a GPU card, I have to ask the same question... Is all 16GB addressable?

      As opposed to RAM that's put on a video card but isn't addressable, so that all it does is waste space and power?

  • So they're all excited about the lowest-precision, smallest-size floating point math in IEEE 754?

    Not only that, but FP16 is intended for storage (of many floating-point values where higher precision need not be stored), not for performing arithmetic computations. [wikipedia.org]

    Kudos to AMD's marketing department for boasting about their compute performance with a number format that was never meant for computation.

    Tell them to get back to me with their 64-, 128-, and 256-bit IEEE floating point performance.
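For concreteness, the limits being complained about are easy to inspect with NumPy's `float16` type (a sketch; `numpy` is assumed to be available):

```python
import numpy as np

# IEEE 754 binary16: 1 sign bit, 5 exponent bits, 10 stored mantissa bits.
info = np.finfo(np.float16)
print(info.max)          # largest finite value: 65504.0
print(info.eps)          # spacing at 1.0 is 2**-10, about 0.000977

# With an 11-bit significand (10 stored + 1 implicit), integers above
# 2048 are no longer exactly representable:
print(np.float16(2049))  # rounds to 2048.0
```

Roughly three decimal digits of precision and a range topping out near 65k — tight, but as the replies below argue, arguably enough for neural-network weights.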

    • Storage is technically a floating point operation!

      — AMD'S marketing department

    • by ShanghaiBill ( 739463 ) on Monday December 12, 2016 @01:42PM (#53469955)

      So they're all excited about the lowest-precision, smallest-size floating point math in IEEE 754?

      FP16 is good enough for neural nets. Do you really think the output voltage of a biological neuron has 32 bits of precision and range? For any given speed, FP16 allows you to run NNs that are wider and deeper, and/or to use bigger datasets. That is way more important than the precision of individual operations.

      • Neural networks have a shaky biological basis at best. More pragmatically, they are a network of perceptrons with sigmoidal output functions. In that case, yes, more bits of precision can be very relevant. Once you start talking about a deep learning network, the updates to individual perceptrons can be very small and 32 bits are needed.
        • by Anonymous Coward

          Actually, since perceptrons can't do non-linearly separable problems, it might be more accurate to call them non-linear multiple regression optimizers. Although, since gradient descent backpropagation is based on a difference of squares, it might be even more accurate to call them non-linear least squares multiple regression optimizers. But then, since they do non-linear regression with zillions of terms, they are really arbitrary function approximators, so it might be even more accurate to call them non-linear

        • More pragmatically, they are a network of perceptrons with sigmoidal output functions.

          Today, most bleeding edge NNs use rectified linear activation functions [wikipedia.org]. Sigmoids are soooo 2014.

          Once you start talking about a deep learning network the updates to individual perceptrons can be very small and 32 bits are needed.

          You can get the same flexibility and more just by going wider and deeper. The bottleneck for NNs is not the math ops, but getting data in and out of the GPU. By using FP16, you cut the per-neuron data in half.
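The "cut the per-neuron data in half" point is easy to demonstrate; a minimal NumPy sketch (the array size and seed here are arbitrary, chosen just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
weights32 = rng.standard_normal(1_000_000).astype(np.float32)
weights16 = weights32.astype(np.float16)

print(weights32.nbytes)  # 4000000 bytes
print(weights16.nbytes)  # 2000000 bytes -- half the data to move per weight

# The cost is rounding: worst-case round-trip error across these weights.
err = np.max(np.abs(weights32 - weights16.astype(np.float32)))
print(err)
```

For weights on the order of 1, the round-trip error stays around 10^-3 or below — small relative to the noise already present in stochastic training.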

      • Accidentally posted as anonymous coward, reposting under my actual name.

        So they're all excited about the lowest-precision, smallest-size floating point math in IEEE 754?

        FP16 is good enough for neural nets. Do you really think the output voltage of a biological neuron has 32 bits of precision and range? For any given speed, FP16 allows you to run NNs that are wider and deeper, and/or to use bigger datasets. That is way more important than the precision of individual operations.

        There's a lot of rounding error with FP16. The neural networks I use are 16-bit integers, which work much, much better, at least for the work I'm doing. Also, do you have a good citation that FP16 neural networks are, overall, more effective than FP32 networks, as you've described?
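A minimal sketch of the fixed-point idea the parent describes — weights quantized to int16 with one shared scale factor (the scale and the example range here are made up purely for illustration):

```python
import numpy as np

def quantize_int16(x, scale):
    # Map floats onto 16-bit integers with a single shared scale factor.
    # Unlike FP16, the precision is uniform across the whole range.
    return np.clip(np.round(x / scale), -32768, 32767).astype(np.int16)

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.25, 0.003], dtype=np.float32)
scale = np.float32(2.0 / 32768)   # assumes weights lie in [-2, 2)
q = quantize_int16(w, scale)
print(dequantize(q, scale))       # values close to the originals
```

The trade-off versus FP16 is that int16 gives more uniform resolution inside the chosen range but no headroom at all outside it, so the scale has to be picked (or learned) per layer.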

        • There's a lot of rounding error with FP16.

          Sure, but it doesn't matter. Backprop, learning rate, denoising, etc. are all just heuristics anyway. So what if your mantissa is off by one bit? You get better accuracy by going wider, adding layers, and (most importantly) using more data. But you can't afford to do that if half your bandwidth is sucked up transmitting meaningless precision.

          Also, do you have a good citation that FP16 neural networks are, overall, more effective than FP32 networks, as you've described?

          They are not necessarily more effective, just more efficient. If you have infinite resources, you might even get better results using FP32. But resources are never infinite.
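The rounding-error dispute above largely comes down to where you accumulate. A standard compromise (not mentioned in the thread, but common practice in mixed-precision training) is FP16 storage with an FP32 accumulator; a small NumPy sketch:

```python
import numpy as np

x = np.full(10_000, 0.1, dtype=np.float16)   # float16(0.1) is about 0.09998

# Naive FP16 accumulation: each partial sum is re-rounded to 10 mantissa bits.
# Once the running sum reaches 256, adding 0.1 can no longer change it
# (the spacing between adjacent float16 values there is 0.25), so it stalls.
naive = np.float16(0.0)
for v in x:
    naive = np.float16(naive + v)

# FP16 storage with an FP32 accumulator stays close to the true ~999.8.
mixed = x.astype(np.float32).sum(dtype=np.float32)

print(float(naive))  # 256.0 -- the sum stalled
print(float(mixed))  # ~999.76
```

Storage precision and accumulation precision are separate knobs; most FP16 hardware marketing assumes you keep the accumulator wide.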

          • So, one problem is that there is not always more data. In my field, we have a surplus of some sorts of data, but other data requires hundreds of thousands of hours of human input, and we only have so much of that to go around. Processing all of that is easy enough; getting more is not.

            Also, by "effective", I should have made it clear that I meant "an effective overall solution to the problem", which includes all the costs of training a wider, lower-precision network. This includes input data collection, storage

      • Do you really think the output voltage of a biological neurons has 32 bits of precision and range?

        ...What? It's analog. It's got precision going down to the quantum scale... you know, depending on noise. Range is also a big issue. But it's leveraging real-world physics to compute things. Think about how many discrete binary operations you'd have to perform to calculate the weighted middle point between populated cities. With an analog "computer" (it's a board with holes, a bit of string with some rocks on the end), its "computation" is done practically instantly when you lift the thing up and gravity p

        • ...What? It's analog. It's got precision going down into the quantum scale...

          That is not true in any meaningful sense. If you give the same inputs to the same biological neuron, there is no way that you are going to get the same output down to the Planck scale. In fact, it is unlikely that you are even going to get 8-bit precision (an output difference of 1/256th).

          • Wow, it's like the meaningful usage of a biological neuron depends on how much noise there is in the system.

            But I think you forget the subject matter. It doesn't matter if the neuron fires 10% early 20% of the time. It's a real-world genetic algorithm system. That's just a feature the GA gets to play with. Because it truly doesn't care about getting exact answers, only good enough to balance a shmuck on two legs... most of the time.

            Jesus, meat-space is just different. Comparing the two is going to run into

          • What I wanted to reply to the parent: it's not like a one-dimensional analog signal either. This leaves out the chemicals and the finer details of what's happening in dendrites, axons, and whatever structures I can't name.
            The idea that you can map out the high-level electrical brain, and only that, and get a brain is a fallacy. It's like we're stuck in the late 90s with Ray Kurzweil's ideas of the brain; I reckon it's the main limit of transhumanism and singularity philosophies. Computer neural networks do have their own uses

    • by Kartu ( 1490911 )

      This is aimed at "deep learning", and 16-bit is just what they need.
