Discussion:
[PATCH 0/3] MIPS/GAS: Extend the R5900 short loop fix test and notes
Fredrik Noring
2018-09-30 09:29:32 UTC
Permalink
Hi,

This set of changes clarifies the conditions for the R5900 short loop fix
and extends its test with the border cases of six and seven instructions.

Fredrik

Fredrik Noring (3):
MIPS/GAS/testsuite: Extend the R5900 short loop fix test with border cases
MIPS/GAS: Clarify the R5900 short loop hardware bug conditions
MIPS/GAS: Correct note on the instruction count for the R5900 short loop fix

gas/config/tc-mips.c | 20 ++++++++++++++++----
gas/testsuite/gas/mips/r5900.d | 18 +++++++++++++++++-
gas/testsuite/gas/mips/r5900.s | 32 ++++++++++++++++++++++++++++----
3 files changed, 61 insertions(+), 9 deletions(-)
--
2.16.4
Fredrik Noring
2018-09-14 14:41:53 UTC
Permalink
The R5900 short loop fix applies up to and including six loop instructions
including the branch and the delay slot. This change extends its test with
the border cases of six and seven instructions. For the latter case, the
fix does not apply, which is now also verified.
---
gas/testsuite/gas/mips/r5900.d | 18 +++++++++++++++++-
gas/testsuite/gas/mips/r5900.s | 32 ++++++++++++++++++++++++++++----
2 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/gas/testsuite/gas/mips/r5900.d b/gas/testsuite/gas/mips/r5900.d
index 7ef9a8a96e..082c204a6f 100644
--- a/gas/testsuite/gas/mips/r5900.d
+++ b/gas/testsuite/gas/mips/r5900.d
@@ -87,7 +87,23 @@ Disassembly of section \.text:
[0-9a-f]+ <[^>]*> 2403012c li \$3,300
[0-9a-f]+ <[^>]*> 2063ffff addi \$3,\$3,-1
[0-9a-f]+ <[^>]*> 2084ffff addi \$4,\$4,-1
-[0-9a-f]+ <[^>]*> 1460fffd bnez \$3,[0-9a-f]+ <short_loop1>
+[0-9a-f]+ <[^>]*> 1460fffd bnez \$3,[0-9a-f]+ <short_loop3>
[0-9a-f]+ <[^>]*> 00000000 nop
+[0-9a-f]+ <[^>]*> 2403012c li \$3,300
+[0-9a-f]+ <[^>]*> 2063ffff addi \$3,\$3,-1
+[0-9a-f]+ <[^>]*> 2084ffff addi \$4,\$4,-1
+[0-9a-f]+ <[^>]*> 20a5ffff addi \$5,\$5,-1
+[0-9a-f]+ <[^>]*> 20c6ffff addi \$6,\$6,-1
+[0-9a-f]+ <[^>]*> 20e7ffff addi \$7,\$7,-1
+[0-9a-f]+ <[^>]*> 1460fffa bnez \$3,[0-9a-f]+ <short_loop6>
+[0-9a-f]+ <[^>]*> 00000000 nop
+[0-9a-f]+ <[^>]*> 2403012c li \$3,300
+[0-9a-f]+ <[^>]*> 2063ffff addi \$3,\$3,-1
+[0-9a-f]+ <[^>]*> 2084ffff addi \$4,\$4,-1
+[0-9a-f]+ <[^>]*> 20a5ffff addi \$5,\$5,-1
+[0-9a-f]+ <[^>]*> 20c6ffff addi \$6,\$6,-1
+[0-9a-f]+ <[^>]*> 20e7ffff addi \$7,\$7,-1
+[0-9a-f]+ <[^>]*> 1460fffa bnez \$3,[0-9a-f]+ <short_loop7>
+[0-9a-f]+ <[^>]*> 2108ffff addi \$8,\$8,-1
[0-9a-f]+ <[^>]*> 24040003 li \$4,3
\.\.\.
diff --git a/gas/testsuite/gas/mips/r5900.s b/gas/testsuite/gas/mips/r5900.s
index 3a16e28f28..9d16b25ece 100644
--- a/gas/testsuite/gas/mips/r5900.s
+++ b/gas/testsuite/gas/mips/r5900.s
@@ -120,13 +120,37 @@ stuff:
.set pop
.set push
.set reorder
- # Short loop fix.
+ # Test the short loop fix with 3 loop instructions.
li $3, 300
-short_loop1:
+short_loop3:
addi $3, -1
addi $4, -1
- # NOP should be inserted in branch delay.
- bne $3, $0, short_loop1
+ # A NOP will be inserted in the branch delay slot.
+ bne $3, $0, short_loop3
+
+ # Test the short loop fix with 6 loop instructions.
+ li $3, 300
+short_loop6:
+ addi $3, -1
+ addi $4, -1
+ addi $5, -1
+ addi $6, -1
+ addi $7, -1
+ # A NOP will be inserted in the branch delay slot.
+ bne $3, $0, short_loop6
+
+ # Test the short loop fix with 7 loop instructions.
+ li $3, 300
+short_loop7:
+ addi $3, -1
+ addi $4, -1
+ addi $5, -1
+ addi $6, -1
+ addi $7, -1
+ addi $8, -1
+ # The short loop fix does not apply for loops with
+ # more than 6 instructions.
+ bne $3, $0, short_loop7

li $4, 3
.set pop
--
2.16.4
Fredrik Noring
2018-09-08 18:01:59 UTC
Permalink
This note was originally stated in gcc-2.95.2/gcc/config/mips/mips.c
in the Sony Linux kit for the PlayStation 2. It collects, clarifies
and details the bug conditions that are tested for in the code below.
---
gas/config/tc-mips.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/gas/config/tc-mips.c b/gas/config/tc-mips.c
index c9fc6c6ec1..684f2bdfa5 100644
--- a/gas/config/tc-mips.c
+++ b/gas/config/tc-mips.c
@@ -6982,9 +6982,21 @@ can_swap_branch_p (struct mips_cl_insn *ip, expressionS *address_expr,
&& insn_length (history) != 4)
return FALSE;

- /* On R5900 short loops need to be fixed by inserting a nop in
- the branch delay slots.
- A short loop can be terminated too early. */
+ /* On R5900 short loops need to be fixed by inserting a NOP in the branch
+ delay slots.
+
+ The short loop bug under certain conditions causes loops to execute
+ only once or twice. We must ensure that the assembler never generates
+ loops that satisfy all of the following conditions:
+
+ - a loop consists of less than or equal to six instructions (including
+ the branch delay slot);
+ - a loop contains only one conditional branch instruction at the end
+ of the loop;
+ - a loop does not contain any other branch or jump instructions;
+ - a branch delay slot of the loop is not NOP (EE 2.9 or later).
+
+ We need to do this because of a hardware bug in the R5900 chip. */
if (mips_opts.arch == CPU_R5900
/* Check if instruction has a parameter, ignore "j $31". */
&& (address_expr != NULL)
--
2.16.4
Fredrik Noring
2018-09-30 07:41:08 UTC
Permalink
The code applying the short loop fix, its test cases, and the original
note by Sony confirm that the correct instruction count including the
branch and the delay slot is six instructions or fewer.

This change amends the note in the code that (incorrectly) says that
the count is strictly less than six instructions.
---
gas/config/tc-mips.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gas/config/tc-mips.c b/gas/config/tc-mips.c
index 684f2bdfa5..0e898dd2d3 100644
--- a/gas/config/tc-mips.c
+++ b/gas/config/tc-mips.c
@@ -7014,7 +7014,7 @@ can_swap_branch_p (struct mips_cl_insn *ip, expressionS *address_expr,
|| (ip->insn_opcode & 0xffff0000) == 0x04110000)) /* bgezal $0 */
{
int distance;
- /* Check if loop is shorter than 6 instructions including
+ /* Check if loop is shorter than or equal to 6 instructions including
branch and delay slot. */
distance = frag_now_fix () - S_GET_VALUE (address_expr->X_add_symbol);
if (distance <= 20)
--
2.16.4
Maciej W. Rozycki
2018-09-30 14:01:53 UTC
Permalink
Hi Fredrik,
Post by Fredrik Noring
This set of changes clarifies the conditions for the R5900 short loop fix
and extends its test with the border cases of six and seven instructions.
Thank you for your contribution. These changes look OK to me, but can
you please reformat comments and change descriptions to fit in 74 columns
(as per
<https://sourceware.org/gdb/wiki/Internals%20GDB-C-Coding-Standards#Formatting_Your_Code>;
this is from our sister project's wiki as we haven't grown our own one
yet, but the projects share both code and the repository, and the rules
are roughly the same).

Also this is your first submission and these changes are substantial
enough to require a copyright assignment or a similar arrangement with
FSF. Can you please clarify if you have one in place?

Maciej
Fredrik Noring
2018-09-30 18:15:51 UTC
Permalink
Thanks for your review, Maciej,
Post by Maciej W. Rozycki
Thank you for your contribution. These changes look OK to me, but can
you please reformat comments and change descriptions to fit in 74 columns
(as per
<https://sourceware.org/gdb/wiki/Internals%20GDB-C-Coding-Standards#Formatting_Your_Code>;
this is from our sister project's wiki as we haven't grown our own one
yet, but the projects share both code and the repository, and the rules
are roughly the same).
Sure, I'll look into that!
Post by Maciej W. Rozycki
Also this is your first submission and these changes are substantial
enough to require a copyright assignment or a similar arrangement with
FSF. Can you please clarify if you have one in place?
I have not (yet) arranged anything with the FSF. By

https://www.gnu.org/prep/maintain/html_node/Copyright-Papers.html

it seems that I should contact ***@gnu.org and then mail a signed form
to the FSF via postal mail, right?

Fredrik
Maciej W. Rozycki
2018-09-30 19:08:34 UTC
Permalink
Hi Nick,
Post by Fredrik Noring
Post by Maciej W. Rozycki
Also this is your first submission and these changes are substantial
enough to require a copyright assignment or a similar arrangement with
FSF. Can you please clarify if you have one in place?
I have not (yet) arranged anything with the FSF. By
https://www.gnu.org/prep/maintain/html_node/Copyright-Papers.html
to the FSF via postal mail, right?
Can you please guide Fredrik with our process?

Maciej
Nick Clifton
2018-10-01 10:01:19 UTC
Permalink
Hi Maciej, Hi Fredrik,
Post by Maciej W. Rozycki
Can you please guide Fredrik with our process?
Sure. (Sorry - I have been v busy with non-binutils stuff this last week).
Post by Maciej W. Rozycki
Post by Fredrik Noring
I have not (yet) arranged anything with the FSF. By
https://www.gnu.org/prep/maintain/html_node/Copyright-Papers.html
to the FSF via postal mail, right?
Actually these days you can do most of it via email.

First you need to choose the appropriate form to fill out,
depending upon whether you want to create a one off assignment
for a single contribution, or an ongoing assignment for multiple
patches. Plus you need to choose which package(s) you will
want to assign to (or all of them if you want).

Have a look at the files in the doc/Copyright directory of the
Gnulib project:

http://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=tree;f=doc/Copyright;h=22db9a802b4da96ad455cd933351c1359108f95d;hb=HEAD

The one that most contributors use is request-assign.future.

Read it, fill in the details and email it off to the FSF. They
should take things from there.

Cheers
Nick
Fredrik Noring
2018-10-11 16:15:35 UTC
Permalink
Hi Maciej,
Post by Maciej W. Rozycki
Thank you for your contribution. These changes look OK to me, but can
you please reformat comments and change descriptions to fit in 74 columns
Done for v2, which I will post shortly.
Post by Maciej W. Rozycki
Also this is your first submission and these changes are substantial
enough to require a copyright assignment or a similar arrangement with
FSF. Can you please clarify if you have one in place?
I am waiting for the FSF to reply.

Fredrik
Fredrik Noring
2018-10-11 16:25:06 UTC
Permalink
Hi,

This set of changes clarifies the conditions for the R5900 short loop fix
and extends its test with the border cases of six and seven instructions.

Fredrik

Changes in v2:
- Reformat to fit 74 columns

Fredrik Noring (3):
MIPS/GAS/testsuite: Extend the R5900 short loop fix test with border
cases
MIPS/GAS: Clarify the R5900 short loop hardware bug conditions
MIPS/GAS: Correct note on the R5900 instruction count short loop fix

gas/config/tc-mips.c | 22 +++++++++++++++++-----
gas/testsuite/gas/mips/r5900.d | 18 +++++++++++++++++-
gas/testsuite/gas/mips/r5900.s | 32 ++++++++++++++++++++++++++++----
3 files changed, 62 insertions(+), 10 deletions(-)
--
2.16.4
Fredrik Noring
2018-10-11 16:25:40 UTC
Permalink
The R5900 short loop fix applies up to and including six loop
instructions including the branch and the delay slot. This change
extends its test with the border cases of six and seven instructions.
For the latter case, the fix does not apply, which is now also verified.
---
gas/testsuite/gas/mips/r5900.d | 18 +++++++++++++++++-
gas/testsuite/gas/mips/r5900.s | 32 ++++++++++++++++++++++++++++----
2 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/gas/testsuite/gas/mips/r5900.d b/gas/testsuite/gas/mips/r5900.d
index 7ef9a8a96e..082c204a6f 100644
--- a/gas/testsuite/gas/mips/r5900.d
+++ b/gas/testsuite/gas/mips/r5900.d
@@ -87,7 +87,23 @@ Disassembly of section \.text:
[0-9a-f]+ <[^>]*> 2403012c li \$3,300
[0-9a-f]+ <[^>]*> 2063ffff addi \$3,\$3,-1
[0-9a-f]+ <[^>]*> 2084ffff addi \$4,\$4,-1
-[0-9a-f]+ <[^>]*> 1460fffd bnez \$3,[0-9a-f]+ <short_loop1>
+[0-9a-f]+ <[^>]*> 1460fffd bnez \$3,[0-9a-f]+ <short_loop3>
[0-9a-f]+ <[^>]*> 00000000 nop
+[0-9a-f]+ <[^>]*> 2403012c li \$3,300
+[0-9a-f]+ <[^>]*> 2063ffff addi \$3,\$3,-1
+[0-9a-f]+ <[^>]*> 2084ffff addi \$4,\$4,-1
+[0-9a-f]+ <[^>]*> 20a5ffff addi \$5,\$5,-1
+[0-9a-f]+ <[^>]*> 20c6ffff addi \$6,\$6,-1
+[0-9a-f]+ <[^>]*> 20e7ffff addi \$7,\$7,-1
+[0-9a-f]+ <[^>]*> 1460fffa bnez \$3,[0-9a-f]+ <short_loop6>
+[0-9a-f]+ <[^>]*> 00000000 nop
+[0-9a-f]+ <[^>]*> 2403012c li \$3,300
+[0-9a-f]+ <[^>]*> 2063ffff addi \$3,\$3,-1
+[0-9a-f]+ <[^>]*> 2084ffff addi \$4,\$4,-1
+[0-9a-f]+ <[^>]*> 20a5ffff addi \$5,\$5,-1
+[0-9a-f]+ <[^>]*> 20c6ffff addi \$6,\$6,-1
+[0-9a-f]+ <[^>]*> 20e7ffff addi \$7,\$7,-1
+[0-9a-f]+ <[^>]*> 1460fffa bnez \$3,[0-9a-f]+ <short_loop7>
+[0-9a-f]+ <[^>]*> 2108ffff addi \$8,\$8,-1
[0-9a-f]+ <[^>]*> 24040003 li \$4,3
\.\.\.
diff --git a/gas/testsuite/gas/mips/r5900.s b/gas/testsuite/gas/mips/r5900.s
index 3a16e28f28..9d16b25ece 100644
--- a/gas/testsuite/gas/mips/r5900.s
+++ b/gas/testsuite/gas/mips/r5900.s
@@ -120,13 +120,37 @@ stuff:
.set pop
.set push
.set reorder
- # Short loop fix.
+ # Test the short loop fix with 3 loop instructions.
li $3, 300
-short_loop1:
+short_loop3:
addi $3, -1
addi $4, -1
- # NOP should be inserted in branch delay.
- bne $3, $0, short_loop1
+ # A NOP will be inserted in the branch delay slot.
+ bne $3, $0, short_loop3
+
+ # Test the short loop fix with 6 loop instructions.
+ li $3, 300
+short_loop6:
+ addi $3, -1
+ addi $4, -1
+ addi $5, -1
+ addi $6, -1
+ addi $7, -1
+ # A NOP will be inserted in the branch delay slot.
+ bne $3, $0, short_loop6
+
+ # Test the short loop fix with 7 loop instructions.
+ li $3, 300
+short_loop7:
+ addi $3, -1
+ addi $4, -1
+ addi $5, -1
+ addi $6, -1
+ addi $7, -1
+ addi $8, -1
+ # The short loop fix does not apply for loops with
+ # more than 6 instructions.
+ bne $3, $0, short_loop7

li $4, 3
.set pop
--
2.16.4
Fredrik Noring
2018-10-11 16:26:03 UTC
Permalink
This note was originally stated in gcc-2.95.2/gcc/config/mips/mips.c
in the Sony Linux kit for the PlayStation 2. It collects, clarifies
and details the bug conditions that are tested for in the code below.
---
gas/config/tc-mips.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/gas/config/tc-mips.c b/gas/config/tc-mips.c
index c9fc6c6ec1..cab1c4c6c8 100644
--- a/gas/config/tc-mips.c
+++ b/gas/config/tc-mips.c
@@ -6982,9 +6982,21 @@ can_swap_branch_p (struct mips_cl_insn *ip, expressionS *address_expr,
&& insn_length (history) != 4)
return FALSE;

- /* On R5900 short loops need to be fixed by inserting a nop in
- the branch delay slots.
- A short loop can be terminated too early. */
+ /* On the R5900 short loops need to be fixed by inserting a NOP in the
+ branch delay slot.
+
+ The short loop bug under certain conditions causes loops to execute
+ only once or twice. We must ensure that the assembler never
+ generates loops that satisfy all of the following conditions:
+
+ - a loop consists of less than or equal to six instructions
+ (including the branch delay slot);
+ - a loop contains only one conditional branch instruction at the end
+ of the loop;
+ - a loop does not contain any other branch or jump instructions;
+ - a branch delay slot of the loop is not NOP (EE 2.9 or later).
+
+ We need to do this because of a hardware bug in the R5900 chip. */
if (mips_opts.arch == CPU_R5900
/* Check if instruction has a parameter, ignore "j $31". */
&& (address_expr != NULL)
--
2.16.4
Fredrik Noring
2018-10-11 16:26:28 UTC
Permalink
The code applying the short loop fix, its test cases, and the original
note by Sony confirm that the correct instruction count including the
branch and the delay slot is six instructions or fewer.

This change amends the note in the code that (incorrectly) says that
the count is strictly less than six instructions.
---
gas/config/tc-mips.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gas/config/tc-mips.c b/gas/config/tc-mips.c
index cab1c4c6c8..918525b4e9 100644
--- a/gas/config/tc-mips.c
+++ b/gas/config/tc-mips.c
@@ -7014,8 +7014,8 @@ can_swap_branch_p (struct mips_cl_insn *ip, expressionS *address_expr,
|| (ip->insn_opcode & 0xffff0000) == 0x04110000)) /* bgezal $0 */
{
int distance;
- /* Check if loop is shorter than 6 instructions including
- branch and delay slot. */
+ /* Check if loop is shorter than or equal to 6 instructions
+ including branch and delay slot. */
distance = frag_now_fix () - S_GET_VALUE (address_expr->X_add_symbol);
if (distance <= 20)
{
--
2.16.4
Nick Clifton
2018-10-16 16:20:51 UTC
Permalink
Hi Fredrik,
Post by Fredrik Noring
This set of changes clarifies the conditions for the R5900 short loop fix
and extends its test with the border cases of six and seven instructions.
Patch series approved. Do you need me to check it in for you ?

Cheers
Nick
Fredrik Noring
2018-10-16 17:05:11 UTC
Permalink
Hi Nick,
Post by Nick Clifton
Patch series approved. Do you need me to check it in for you ?
Yes, please. I'm still waiting for the FSF to sort out and reply with the
paperwork, though.

Fredrik
Maciej W. Rozycki
2018-10-16 17:15:40 UTC
Permalink
Post by Fredrik Noring
Post by Nick Clifton
Patch series approved. Do you need me to check it in for you ?
Yes, please. I'm still waiting for the FSF to sort out and reply with the
paperwork, though.
I was going to push your changes once you've got the paperwork sorted
out, but if Nick is happy to do that, then I won't get in the way.

Maciej
Nick Clifton
2018-10-19 08:51:52 UTC
Permalink
Hi Guys,

I have now committed the patch series.
Post by Maciej W. Rozycki
I was going to push your changes once you've got the paperwork sorted
out, but if Nick is happy to do that, then I won't get in the way.
I think that the patch series is simple enough that it counts as "obvious".

Cheers
Nick
Maciej W. Rozycki
2018-10-19 16:22:23 UTC
Permalink
Hi Nick,
Post by Nick Clifton
I have now committed the patch series.
Thanks for making it one item less for me to do.
Post by Nick Clifton
Post by Maciej W. Rozycki
I was going to push your changes once you've got the paperwork sorted
out, but if Nick is happy to do that, then I won't get in the way.
I think that the patch series is simple enough that it counts as "obvious".
Except for the usual coding standard nits -- which is why in the first
round I asked Fredrik to send this v2 of the series.

Maciej

Loading...