Commit 7515e75

HDFS-11505. Do not enable any erasure coding policies by default. Contributed by Manoj Govindassamy.
1 parent 34424e9 commit 7515e75

32 files changed, +112 -16 lines changed

hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java

Lines changed: 1 addition & 1 deletion
@@ -563,7 +563,7 @@ public class DFSConfigKeys extends CommonConfigurationKeys {
       "10m";

   public static final String DFS_NAMENODE_EC_POLICIES_ENABLED_KEY = "dfs.namenode.ec.policies.enabled";
-  public static final String DFS_NAMENODE_EC_POLICIES_ENABLED_DEFAULT = "RS-6-3-64k";
+  public static final String DFS_NAMENODE_EC_POLICIES_ENABLED_DEFAULT = "";
   public static final String DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_THREADS_KEY = "dfs.datanode.ec.reconstruction.stripedread.threads";
   public static final int DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_THREADS_DEFAULT = 20;
   public static final String DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_KEY = "dfs.datanode.ec.reconstruction.stripedread.buffer.size";

hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ErasureCodingPolicyManager.java

Lines changed: 4 additions & 1 deletion
@@ -98,12 +98,15 @@ public final class ErasureCodingPolicyManager {
         DFSConfigKeys.DFS_NAMENODE_EC_POLICIES_ENABLED_DEFAULT);
     this.enabledPoliciesByName = new TreeMap<>();
     for (String policyName : policyNames) {
+      if (policyName.trim().isEmpty()) {
+        continue;
+      }
       ErasureCodingPolicy ecPolicy = SYSTEM_POLICIES_BY_NAME.get(policyName);
       if (ecPolicy == null) {
         String sysPolicies = Arrays.asList(SYS_POLICIES).stream()
             .map(ErasureCodingPolicy::getName)
             .collect(Collectors.joining(", "));
-        String msg = String.format("EC policy %s specified at %s is not a " +
+        String msg = String.format("EC policy '%s' specified at %s is not a " +
             "valid policy. Please choose from list of available policies: " +
             "[%s]",
             policyName,
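
Reviewer note: the effect of the new blank-entry check can be reproduced outside Hadoop. The sketch below is a hypothetical stand-in, not the NameNode code: the class `EnabledPolicyParser`, its `parse` method, and the plain `split(",")` are assumptions made for illustration. Only the five policy names and the skip-if-blank rule come from this commit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustrative stand-in for the enabled-policy lookup; not Hadoop code. */
final class EnabledPolicyParser {
  // The five built-in policy names listed in HDFSErasureCoding.md.
  static final Set<String> SYSTEM_POLICIES = new LinkedHashSet<>(Arrays.asList(
      "RS-3-2-64k", "RS-6-3-64k", "RS-10-4-64k", "RS-LEGACY-6-3-64k", "XOR-2-1-64k"));

  /** Splits a comma-delimited value, skips blank entries, rejects unknown names. */
  static List<String> parse(String configValue) {
    List<String> enabled = new ArrayList<>();
    for (String raw : configValue.split(",")) {
      String name = raw.trim();
      if (name.isEmpty()) {
        continue; // the new empty default ("") therefore enables nothing
      }
      if (!SYSTEM_POLICIES.contains(name)) {
        throw new IllegalArgumentException(String.format(
            "EC policy '%s' is not a valid policy. Please choose from: %s",
            name, SYSTEM_POLICIES));
      }
      enabled.add(name);
    }
    return enabled;
  }

  public static void main(String[] args) {
    System.out.println(parse(""));                        // prints []
    System.out.println(parse("RS-6-3-64k, XOR-2-1-64k")); // prints [RS-6-3-64k, XOR-2-1-64k]
  }
}
```

Without the blank check, the empty default would fall through to the lookup and fail as an unknown policy named `""`, which is presumably why the guard and the quoted `'%s'` in the error message were added together.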

hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml

Lines changed: 3 additions & 2 deletions
@@ -2930,10 +2930,11 @@

 <property>
   <name>dfs.namenode.ec.policies.enabled</name>
-  <value>RS-6-3-64k</value>
+  <value></value>
   <description>Comma-delimited list of enabled erasure coding policies.
     The NameNode will enforce this when setting an erasure coding policy
-    on a directory.
+    on a directory. By default, none of the built-in erasure coding
+    policies are enabled.
   </description>
 </property>

hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSErasureCoding.md

Lines changed: 9 additions & 4 deletions
@@ -66,7 +66,7 @@ Architecture

 Policies are named *codec*-*num data blocks*-*num parity blocks*-*cell size*. Currently, five built-in policies are supported: `RS-3-2-64k`, `RS-6-3-64k`, `RS-10-4-64k`, `RS-LEGACY-6-3-64k`, and `XOR-2-1-64k`.

-By default, only `RS-6-3-64k` is enabled.
+By default, all built-in erasure coding policies are disabled.

 Similar to HDFS storage policies, erasure coding policies are set on a directory. When a file is created, it inherits the EC policy of its nearest ancestor directory.

@@ -91,15 +91,20 @@ Deployment
 Network bisection bandwidth is thus very important.

 For rack fault-tolerance, it is also important to have at least as many racks as the configured EC stripe width.
-For the default EC policy of RS (6,3), this means minimally 9 racks, and ideally 10 or 11 to handle planned and unplanned outages.
+For EC policy RS (6,3), this means minimally 9 racks, and ideally 10 or 11 to handle planned and unplanned outages.
 For clusters with fewer racks than the stripe width, HDFS cannot maintain rack fault-tolerance, but will still attempt
 to spread a striped file across multiple nodes to preserve node-level fault-tolerance.

 ### Configuration keys

-The set of enabled erasure coding policies can be configured on the NameNode via `dfs.namenode.ec.policies.enabled`. This restricts what EC policies can be set by clients. It does not affect the behavior of already set file or directory-level EC policies.
+The set of enabled erasure coding policies can be configured on the NameNode via `dfs.namenode.ec.policies.enabled` configuration. This restricts
+what EC policies can be set by clients. It does not affect the behavior of already set file or directory-level EC policies.

-By default, only the `RS-6-3-64k` policy is enabled. Typically, the cluster administrator will configure the set of enabled policies based on the size of the cluster and the desired fault-tolerance properties. For instance, for a cluster with 9 racks, a policy like `RS-10-4-64k` will not preserve rack-level fault-tolerance, and `RS-6-3-64k` or `RS-3-2-64k` might be more appropriate. If the administrator only cares about node-level fault-tolerance, `RS-10-4-64k` would still be appropriate as long as there are at least 14 DataNodes in the cluster.
+By default, all built-in erasure coding policies are disabled. Typically, the cluster administrator will enable set of policies by including them
+in the `dfs.namenode.ec.policies.enabled` configuration based on the size of the cluster and the desired fault-tolerance properties. For instance,
+for a cluster with 9 racks, a policy like `RS-10-4-64k` will not preserve rack-level fault-tolerance, and `RS-6-3-64k` or `RS-3-2-64k` might
+be more appropriate. If the administrator only cares about node-level fault-tolerance, `RS-10-4-64k` would still be appropriate as long as
+there are at least 14 DataNodes in the cluster.

 The codec implementation for Reed-Solomon and XOR can be configured with the following client and DataNode configuration keys:
 `io.erasurecode.codec.rs.rawcoder` for the default RS codec,
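
Reviewer note: since administrators now have to pick policies explicitly, the sizing rule from the documentation above may be worth spelling out. The sketch below assumes the stripe width is the number of data blocks plus parity blocks, which matches the documented figures (9 racks for RS (6,3), 14 DataNodes for `RS-10-4-64k`). The class `EcPolicySizing` and its methods are hypothetical helpers, not Hadoop APIs.

```java
import java.util.Arrays;

/** Illustrative helper (not part of Hadoop) for checking a cluster against an EC policy name. */
final class EcPolicySizing {
  final String codec;
  final int dataUnits;
  final int parityUnits;

  private EcPolicySizing(String codec, int dataUnits, int parityUnits) {
    this.codec = codec;
    this.dataUnits = dataUnits;
    this.parityUnits = parityUnits;
  }

  /** Parses codec-dataBlocks-parityBlocks-cellSize; the codec itself may contain '-'. */
  static EcPolicySizing fromName(String name) {
    String[] parts = name.split("-");
    if (parts.length < 4) {
      throw new IllegalArgumentException("Unexpected policy name: " + name);
    }
    int n = parts.length;
    // The last three fields are data blocks, parity blocks and cell size.
    String codec = String.join("-", Arrays.copyOfRange(parts, 0, n - 3));
    return new EcPolicySizing(
        codec, Integer.parseInt(parts[n - 3]), Integer.parseInt(parts[n - 2]));
  }

  /** Assumed here to be data blocks plus parity blocks. */
  int stripeWidth() {
    return dataUnits + parityUnits;
  }

  /** Rack-level fault tolerance needs at least as many racks as the stripe width. */
  boolean fitsRacks(int racks) {
    return racks >= stripeWidth();
  }

  /** Node-level fault tolerance needs at least as many DataNodes as the stripe width. */
  boolean fitsDataNodes(int dataNodes) {
    return dataNodes >= stripeWidth();
  }

  public static void main(String[] args) {
    EcPolicySizing rs104 = fromName("RS-10-4-64k");
    System.out.println(rs104.stripeWidth());                    // prints 14
    System.out.println(rs104.fitsRacks(9));                     // prints false
    System.out.println(fromName("RS-6-3-64k").fitsRacks(9));    // prints true
  }
}
```

The check covers only the minimum; the documentation still recommends one or two spare racks beyond the stripe width to absorb planned and unplanned outages.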

hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommissionWithStriped.java

Lines changed: 2 additions & 0 deletions
@@ -131,6 +131,8 @@ public void setup() throws IOException {
     conf.setInt(DFSConfigKeys.DFS_NAMENODE_REDUNDANCY_INTERVAL_SECONDS_KEY, 1);
     conf.setBoolean(DFSConfigKeys.DFS_NAMENODE_REDUNDANCY_CONSIDERLOAD_KEY,
         false);
+    conf.set(DFSConfigKeys.DFS_NAMENODE_EC_POLICIES_ENABLED_KEY,
+        StripedFileTestUtil.getDefaultECPolicy().getName());

     numDNs = dataBlocks + parityBlocks + 2;
     cluster = new MiniDFSCluster.Builder(conf).numDataNodes(numDNs).build();

hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestErasureCodeBenchmarkThroughput.java

Lines changed: 2 additions & 0 deletions
@@ -48,6 +48,8 @@ public static void setup() throws IOException {
     conf = new HdfsConfiguration();
     int numDN = ErasureCodeBenchmarkThroughput.getEcPolicy().getNumDataUnits() +
         ErasureCodeBenchmarkThroughput.getEcPolicy().getNumParityUnits();
+    conf.set(DFSConfigKeys.DFS_NAMENODE_EC_POLICIES_ENABLED_KEY,
+        ErasureCodeBenchmarkThroughput.getEcPolicy().getName());
     cluster = new MiniDFSCluster.Builder(conf).numDataNodes(numDN).build();
     cluster.waitActive();
     fs = cluster.getFileSystem();

hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestErasureCodingPolicyWithSnapshot.java

Lines changed: 2 additions & 0 deletions
@@ -48,6 +48,8 @@ public class TestErasureCodingPolicyWithSnapshot {
   @Before
   public void setupCluster() throws IOException {
     conf = new HdfsConfiguration();
+    conf.set(DFSConfigKeys.DFS_NAMENODE_EC_POLICIES_ENABLED_KEY,
+        sysDefaultPolicy.getName());
     cluster = new MiniDFSCluster.Builder(conf).numDataNodes(groupSize).build();
     cluster.waitActive();
     fs = cluster.getFileSystem();

hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileChecksum.java

Lines changed: 2 additions & 0 deletions
@@ -77,6 +77,8 @@ public void setup() throws IOException {
     conf.setBoolean(DFSConfigKeys.DFS_NAMENODE_REDUNDANCY_CONSIDERLOAD_KEY,
         false);
     conf.setInt(DFSConfigKeys.DFS_NAMENODE_REPLICATION_MAX_STREAMS_KEY, 0);
+    conf.set(DFSConfigKeys.DFS_NAMENODE_EC_POLICIES_ENABLED_KEY,
+        StripedFileTestUtil.getDefaultECPolicy().getName());
     cluster = new MiniDFSCluster.Builder(conf).numDataNodes(numDNs).build();
     Path ecPath = new Path(ecDir);
     cluster.getFileSystem().mkdir(ecPath, FsPermission.getDirDefault());

hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileStatusWithECPolicy.java

Lines changed: 4 additions & 1 deletion
@@ -43,8 +43,11 @@ public class TestFileStatusWithECPolicy {

   @Before
   public void before() throws IOException {
+    HdfsConfiguration conf = new HdfsConfiguration();
+    conf.set(DFSConfigKeys.DFS_NAMENODE_EC_POLICIES_ENABLED_KEY,
+        StripedFileTestUtil.getDefaultECPolicy().getName());
     cluster =
-        new MiniDFSCluster.Builder(new Configuration()).numDataNodes(1).build();
+        new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
     cluster.waitActive();
     fs = cluster.getFileSystem();
     client = fs.getClient();

hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestLeaseRecoveryStriped.java

Lines changed: 2 additions & 0 deletions
@@ -88,6 +88,8 @@ public void setup() throws IOException {
         false);
     conf.setInt(DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY, 1);
     conf.setInt(DFSConfigKeys.DFS_NAMENODE_REPLICATION_MAX_STREAMS_KEY, 0);
+    conf.set(DFSConfigKeys.DFS_NAMENODE_EC_POLICIES_ENABLED_KEY,
+        ecPolicy.getName());
     final int numDNs = dataBlocks + parityBlocks;
     cluster = new MiniDFSCluster.Builder(conf).numDataNodes(numDNs).build();
     cluster.waitActive();
